C - URL encoding

Posted 2019-02-22 01:35

Question:

Is there a simple way to do URL encode in C? I'm using libcurl but I haven't found a method. Specifically, I need to do percent escapes.

Answer 1:

curl_escape

which has been superseded by

curl_easy_escape

Both return a newly allocated string that must be released with curl_free.



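Answer 2:

In C, based on Wikipedia, without having to allocate and free. Make sure the output buffer is at least 3 times the length of the input URL string, plus one byte for the terminating NUL, because every input byte can expand to a three-character %XX sequence. URLs tend to be short, so you usually only need to encode up to about 4K and can do it on the stack.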

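#include <ctype.h>
#include <stdio.h>

/* Lookup tables: a nonzero entry is the character to emit as-is,
   zero means the byte must be percent-encoded. */
char rfc3986[256] = {0};
char html5[256] = {0};

void url_encoder_rfc_tables_init(void){

    int i;

    for (i = 0; i < 256; i++){

        rfc3986[i] = isalnum( i) || i == '~' || i == '-' || i == '.' || i == '_' ? i : 0;
        html5[i] = isalnum( i) || i == '*' || i == '-' || i == '.' || i == '_' ? i : (i == ' ') ? '+' : 0;
    }
}

/* enc must have room for 3 * strlen(s) + 1 bytes.
   Returns a pointer to the terminating NUL of the encoded string. */
char *url_encode( const char *table, const unsigned char *s, char *enc){

    for (; *s; s++){

        if (table[*s]) *enc++ = table[*s];
        else enc += sprintf( enc, "%%%02X", *s);
    }

    *enc = '\0';   /* also terminates the output when the input is empty */
    return( enc);
}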

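Use it like this, where url is the NUL-terminated string to encode:

char url_encoded[3 * 4096 + 1];   /* assumes url is at most 4096 bytes long */

url_encoder_rfc_tables_init();

url_encode( html5, (unsigned char *) url, url_encoded);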
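
If the 3x sizing rule cannot be guaranteed by the caller, the same table idea can be combined with an explicit capacity check. The following is only a sketch under that assumption: build_rfc3986_table and url_encode_bounded are hypothetical names that do not appear in the answer above, and the table here covers only the RFC 3986 unreserved set.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: fills a 256-entry table. A nonzero entry is the
   byte to emit unchanged; zero means "percent-encode this byte". */
void build_rfc3986_table(char table[256])
{
    for (int i = 0; i < 256; i++) {
        /* Restrict isalnum to ASCII so the result does not depend on locale. */
        int unreserved = (i < 128 && isalnum(i)) ||
                         i == '-' || i == '.' || i == '_' || i == '~';
        table[i] = unreserved ? (char)i : 0;
    }
}

/* Hypothetical bounded variant: encodes src into dst without ever writing
   more than cap bytes (including the terminating NUL). Returns the encoded
   length, or (size_t)-1 if dst is too small; in that case dst holds an
   empty string when cap > 0. */
size_t url_encode_bounded(const char *table, const char *src,
                          char *dst, size_t cap)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t n = 0;

    for (const unsigned char *s = (const unsigned char *)src; *s; s++) {
        size_t need = table[*s] ? 1 : 3;
        if (n + need >= cap) {          /* keep one byte for the NUL */
            if (cap > 0) dst[0] = '\0';
            return (size_t)-1;
        }
        if (table[*s]) {
            dst[n++] = table[*s];
        } else {
            dst[n++] = '%';
            dst[n++] = hex[*s >> 4];
            dst[n++] = hex[*s & 0x0F];
        }
    }
    if (cap == 0) return (size_t)-1;
    dst[n] = '\0';
    return n;
}
```

With a 16-byte buffer out and a table filled by build_rfc3986_table, calling url_encode_bounded(table, "a b/c", out, sizeof out) stores a%20b%2Fc and returns 9. Writing the hex digits directly also avoids one sprintf call per byte.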


Answer 3:

I wrote this to also handle query-string encoding of the space character.

Usage: UrlEncode("http://www.example.com/index.html?Hello=World", " :/", buffer, buf_size)

url: The url string to encode. Can be string literal or string array

encode: A zero-terminated string of the chars to encode. This is useful because you can decide at runtime how much of the url to encode

buffer: A buffer to hold the new string

size: The size of the buffer

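return: Returns the length of the encoded string (not counting the terminator) if the buffer is large enough, OR the length that would be required if it is not. To allocate exactly the size needed, call the function twice: once to learn the required length, then again with a buffer of that length plus one byte for the terminator.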

    #include <string.h>

    int UrlEncode(const char* url, const char* encode, char* buffer, unsigned int size)
    {
        static const char hex[] = "0123456789ABCDEF";
        char chars[256] = {0};
        unsigned int length = 0;   //Encoded length, excluding the terminator
        unsigned int used = 0;     //Bytes written to buffer so far
        int query = 0;             //Set once we have passed the first '?'

        if(!url || !encode || !buffer) return 0;

    //Build a lookup table of the chars to encode. I used this construct
    //instead of a large if statement for speed.
        while(*encode) chars[(unsigned char)*encode++] = 1;

    //Loop through url. Chars in the table become % plus two hex digits,
    //except that a space in the query string (after '?') becomes '+'.
    //Only write while the whole sequence plus the terminator still fits,
    //but keep counting so the caller learns the required length.
        for(; *url; url++) {
            unsigned char c = (unsigned char)*url;
            unsigned int n = 1;
            char out[3];

            out[0] = (char)c;
            if(chars[c]) {
                if(query && c == ' ') out[0] = '+';
                else { out[0] = '%'; out[1] = hex[c >> 4]; out[2] = hex[c & 15]; n = 3; }
            }
            if(c == '?') query = 1;
            if(used == length && used + n < size) { memcpy(buffer + used, out, n); used += n; }
            length += n;
        }

    //Terminate the buffer and return the encoded length. A result >= size
    //means the output was truncated and the buffer needs length + 1 bytes.
        if(size) buffer[used] = '\0';
        return (int)length;
    }

This function handles most (if not all) of the url encoding you'd need, and since each character costs only a table lookup, it's fast.
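
The call-twice usage mentioned in the return description can be sketched as follows. This is an illustration under assumptions, not the function above: percent_encode and percent_encode_alloc are hypothetical names, the helper accepts a NULL buffer when the size is 0 so that it can be used purely for measuring, and the special handling of spaces in the query string is left out to keep it short.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: percent-encodes every byte of src that appears in
   `reserved`. Writes at most cap bytes (including the NUL) to dst and
   returns the length the full result needs, excluding the NUL, so a
   return value >= cap means the output was truncated.
   dst may be NULL when cap is 0 (measure only). */
size_t percent_encode(const char *src, const char *reserved,
                      char *dst, size_t cap)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t need = 0, used = 0;

    for (; *src; src++) {
        unsigned char c = (unsigned char)*src;
        char out[3] = { (char)c, 0, 0 };
        size_t n = 1;

        if (strchr(reserved, c)) {
            out[0] = '%';
            out[1] = hex[c >> 4];
            out[2] = hex[c & 0x0F];
            n = 3;
        }
        /* Write only while everything so far has fit, keeping room for NUL. */
        if (used == need && used + n < cap) {
            memcpy(dst + used, out, n);
            used += n;
        }
        need += n;
    }
    if (cap > 0) dst[used] = '\0';
    return need;
}

/* First call measures, second call fills a buffer of exactly that size.
   Returns a malloc'd string the caller must free, or NULL on failure. */
char *percent_encode_alloc(const char *src, const char *reserved)
{
    size_t len = percent_encode(src, reserved, NULL, 0);
    char *dst = malloc(len + 1);
    if (dst) percent_encode(src, reserved, dst, len + 1);
    return dst;
}
```

For example, percent_encode_alloc("http://a.b/c d", " :/") yields http%3A%2F%2Fa.b%2Fc%20d. The measuring pass costs a second walk over the input, which is usually a fair trade for not having to guess a buffer size.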