How to get the fragment part of a URL from libcurl

Posted 2019-08-15 06:37

Question:

I get redirected to a page with an address like http://example.com#foo=bar. I want to get the foo=bar part of it. The whole URL would be fine too.

I found this thing:

char * url;
curl_easy_getinfo(myHandle, CURLINFO_EFFECTIVE_URL, &url);

My English is not good enough to find the information myself. Every time I search, I only find information on reading the page into a string variable.

Code:

std::string readBuffer;  // filled by WriteCallback (defined elsewhere)
curl_global_init(CURL_GLOBAL_ALL);
CURL * myHandle;
CURLcode result;
myHandle = curl_easy_init();
curl_easy_setopt(myHandle, CURLOPT_COOKIEJAR, "coo.txt");
curl_easy_setopt(myHandle, CURLOPT_COOKIEFILE, "coo.txt");
curl_easy_setopt(myHandle, CURLOPT_URL, "https://www.google.ru/#q=stack");
curl_easy_setopt(myHandle, CURLOPT_WRITEFUNCTION, WriteCallback);
curl_easy_setopt(myHandle, CURLOPT_WRITEDATA, &readBuffer);
curl_easy_setopt(myHandle, CURLOPT_FOLLOWLOCATION, 1L);  // follow redirects
result = curl_easy_perform(myHandle);
char * ch_cur_url = nullptr;  // stays null if getinfo fails
result = curl_easy_getinfo(myHandle, CURLINFO_EFFECTIVE_URL,
        &ch_cur_url);
if (result == CURLE_OK && ch_cur_url != nullptr)
    printf("%s\n", ch_cur_url);

This outputs https://www.google.ru/

but I wanted https://www.google.ru/#q=stack

Answer 1:

cURL removes the "fragment identifier" from the URL before making a request, as per the bug reports (1, 2). See also this patch. Thus the "fragment identifier" is not available as part of the CURLINFO_EFFECTIVE_URL.

If the "fragment identifier" is returned as part of a redirect (e.g. in the Location HTTP header) and you can't get it any other way, you can use the debug facilities to peek at the communication between cURL and the server and extract the "fragment identifier" yourself. To that end you'll need to set up either CURLOPT_DEBUGFUNCTION (which also requires CURLOPT_VERBOSE to be enabled) or CURLOPT_HEADERFUNCTION.
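A minimal sketch of the CURLOPT_HEADERFUNCTION route, in case it helps. It assumes the fragment arrives in a Location header line. The names FragmentFromLocationHeader and HeaderCallback are made up for this example and are not part of libcurl. The callback has the signature libcurl documents for header callbacks, and it assumes the user-data pointer is a std::string that receives the last fragment seen. The block uses only the standard library, so it compiles without libcurl.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the text after '#' in a "Location:" header
// line, or an empty string if the line is not a Location header or carries
// no fragment.
std::string FragmentFromLocationHeader(const std::string& line) {
    static const std::string kName = "location:";
    if (line.size() < kName.size()) return "";
    // Header field names are case-insensitive, so compare in lowercase.
    for (std::size_t i = 0; i < kName.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(line[i])) != kName[i]) {
            return "";
        }
    }
    const std::size_t hash = line.find('#', kName.size());
    if (hash == std::string::npos) return "";
    std::string fragment = line.substr(hash + 1);
    // Drop the trailing CR/LF (and any other whitespace) ending the line.
    while (!fragment.empty() &&
           std::isspace(static_cast<unsigned char>(fragment.back()))) {
        fragment.pop_back();
    }
    return fragment;
}

// Header callback with the signature libcurl expects. libcurl calls it once
// per header line. Assumption: userdata points to a std::string.
std::size_t HeaderCallback(char* buffer, std::size_t size,
                           std::size_t nitems, void* userdata) {
    const std::size_t total = size * nitems;
    const std::string fragment =
        FragmentFromLocationHeader(std::string(buffer, total));
    if (!fragment.empty()) {
        *static_cast<std::string*>(userdata) = fragment;
    }
    // Must return the number of bytes handled, or libcurl aborts the transfer.
    return total;
}
```

To hook it up, you would call curl_easy_setopt(myHandle, CURLOPT_HEADERFUNCTION, HeaderCallback) and curl_easy_setopt(myHandle, CURLOPT_HEADERDATA, &fragment) with a std::string fragment before curl_easy_perform. With CURLOPT_FOLLOWLOCATION enabled, headers from every response in the redirect chain pass through the callback, so the string ends up holding the fragment from the last Location header that had one.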

P.S. A bit of advice: Googling the relevant information was very easy. The first thing I did was learn the "official" name of the #foo=bar part. To find it, I visited the Wikipedia article on URL, which led me to Fragment identifier. After that, Googling "curl fragment" turned up the relevant pages. If you're looking for something, learn its proper name.



Tags: c++ libcurl