Download is not resuming using Curl C API

Published 2019-04-17 16:15

Question:

  1. I am trying to resume a download that failed due to an internet failure. The function I am using to check the success of a curl download is:

    curl_multi_info_read
    

    This function returns the proper error code (CURLE_COULDNT_CONNECT) the first time it is called after the internet connection is lost. If I call it again, it returns a NULL pointer, meaning there are no messages. I am using the returned error code to check whether an internet connection is present, so it troubles me that it doesn't return any error code on its second call when there is still no internet. Can anyone please tell me how to use this function to check the return code? This error code (CURLE_COULDNT_CONNECT) is very important to me for checking the status of the internet connection, so that I can resume the download from where it stopped once the connection is back.

  2. In order to resume download I am using

    curl_easy_setopt (curl, CURLOPT_RESUME_FROM, InternalOffset);
    

    I am calling this function to set the option each time the internet connection is lost, so that the download can resume once the connection is back.


Notes to Daniel Stenberg:

Here are some details about platform and libcurl version:

  • curl version - libcurl 7.21.6
  • platform - Linux (Ubuntu)

Comments:

  1. Yes. Your view is right. I removed the easy handle from the multi stack, added it to the multi handle again after setting the new option (curl_easy_setopt(curl, CURLOPT_RESUME_FROM, InternalOffset)), and finally did a multi perform. It returned the proper error if there is no internet connection. My question is: do I need to repeat the above steps every time the internet connection is lost in order to get the proper error? If I don't do these steps, will curl_multi_info_read always return NULL?

  2. One more observation I made is that the download resumes as soon as the internet connection is back, continuing from the point where it previously stopped. This came as a surprise to me. Is curl internally taking care of resuming the download when it gets the internet connection back? If so, do I really need to take care of resuming the download myself, or can I leave it to curl since it handles it properly?

Answer 1:

You might need to provide more info.

E.g. you don't explicitly say whether you're using the multi interface or the easy interface, and it would help to mention what platform you're working on, which libcurl version you're using, etc.

Below are minimal curl easy and multi tests against libcurl/7.21.6.
I have happily yanked out the network cables, stopped HTTP servers & so on ~ it seems to cope OK.

These might help you:

curl_easy_setopt(curl, CURLOPT_LOW_SPEED_LIMIT, dl_lowspeed_bytes); //bytes/sec
curl_easy_setopt(curl, CURLOPT_LOW_SPEED_TIME, dl_lowspeed_time); //seconds
curl_easy_setopt(curl, CURLOPT_VERBOSE, 1L);


NB: you have to work quite hard to make curl fall over when the connection drops out. This is by design, but it comes as a surprise to some.

[Edit:]
I doubt you would want to be using CURLOPT_TIMEOUT. This would time out the transfer. If your d/l is large then it will almost certainly take longer than you may be prepared to wait to find out whether there's something wrong with your network connection ~> the timeout would get hit. By contrast, the CURLOPT_LOW_SPEED_TIME timeout may never get hit, even after hours of elapsed transfer time.


curltest_easy.c:

/*----------------------------------------------------
curltest_easy.c 
WARNING: for test purposes only ~ 
*/
#include <stdio.h>
#include <unistd.h>
#include <curl/curl.h> /* curl.h pulls in easy.h; curl/types.h is obsolete */
#include <sys/stat.h>



static int dl_progress(void *clientp,double dltotal,double dlnow,double ultotal,double ulnow)
{
    if (dlnow && dltotal)
        printf("dl:%3.0f%%\r",100*dlnow/dltotal); //shenzi prog-mon 
    fflush(stdout);    
    return 0;
}

static size_t dl_write(void *buffer, size_t size, size_t nmemb, void *stream)
{    
    return fwrite(buffer, size, nmemb, (FILE*)stream); 
}


int do_dl(void) 
{
    CURL *curl;
    FILE *fp;
    CURLcode curl_retval;
    long http_response;
    double dl_size;
    int retval=0;
    long dl_lowspeed_bytes=1000; //1K
    long dl_lowspeed_time=10; //sec        
    /*put something biG here, preferably on a server that you can switch off at will ;) */
    char url[] = {"http://fc00.deviantart.net/fs26/f/2008/134/1/a/Dragon_VII_by_NegativeFeedback.swf"};
    char filename[]={"blah.dl"};

    struct stat st={0};    
    if (!stat(filename, &st))
        printf("st.st_size:[%ld]\n", st.st_size);  


    if(!(fp=fopen(filename, "ab"))) /*append binary*/
      return 1; 


    curl_global_init(CURL_GLOBAL_DEFAULT);   
    curl = curl_easy_init();

    if (curl) 
    {   
        //http://linux.die.net/man/3/curl_easy_setopt
        curl_easy_setopt(curl, CURLOPT_URL, url);

        /*callbacks*/
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, dl_write);
        curl_easy_setopt(curl, CURLOPT_PROGRESSFUNCTION, dl_progress);
        curl_easy_setopt(curl, CURLOPT_NOPROGRESS, 0);

        /*curl will keep running -so you have the freedom to recover 
        from network disconnects etc in your own way without
        disturbing the curl task in hand. ** this is by design :p ** */ 
        //curl_easy_setopt(curl, CURLOPT_TIMEOUT, 60);          
        //curl_easy_setopt(curl, CURLOPT_CONNECTTIMEOUT, 30);
        /*set up min download speed threshold & time endured before aborting*/
        curl_easy_setopt(curl, CURLOPT_LOW_SPEED_LIMIT, dl_lowspeed_bytes); //bytes/sec
        curl_easy_setopt(curl, CURLOPT_LOW_SPEED_TIME, dl_lowspeed_time); //seconds while below low speed limit before aborting


        curl_easy_setopt(curl, CURLOPT_WRITEDATA, fp);
        curl_easy_setopt(curl, CURLOPT_RESUME_FROM,st.st_size);

        /*uncomment this to get curl to tell you what its up to*/
        //curl_easy_setopt(curl, CURLOPT_VERBOSE, 1L);


        if(CURLE_OK != (curl_retval=curl_easy_perform(curl)))
        {                      
            printf("curl_retval:[%d]\n", curl_retval);
            switch(curl_retval) 
            {
                //Transferred a partial file
                case CURLE_WRITE_ERROR: //can be due to a dropped connection
                break;

                //all defined in curl/curl.h 

                default: //in real code you would probably quit on an unhandled error
                retval=0; //test prog: keep retrying
            };    


            curl_easy_getinfo(curl, CURLINFO_CONTENT_LENGTH_DOWNLOAD, &dl_size);
            printf("CURLINFO_CONTENT_LENGTH_DOWNLOAD:%f\n", dl_size);


            curl_retval=curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &http_response);

            //see: http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
            printf("CURLINFO_RESPONSE_CODE:%ld\n", http_response);

            switch(http_response)
            {
            case 0: //eg connection down from kick-off ~ suggest retrying up to some max limit
            break;

            case 200: //yay we at least got to our url
            break;

            case 206:
            case 416: //http://www.checkupdown.com/status/E416.html
            printf("ouch! you might want to handle this & others\n"); 
            //fall through

            default: //in real code you would probably quit on an unhandled error
            retval=0; //test prog: keep retrying
            };            
        }
        else
        {
            printf("our work here is done ;)\n");
            retval=2;
        }


        if (fp)
            fclose(fp);

        if (curl)
            curl_easy_cleanup(curl);
    }

    printf("retval [%d]\n", retval);
    return retval;
}


int main(void) 
{
    while (!do_dl())
    {
        usleep(5000);
    }

    return 0;
}

/* notes ----

$sudo apt-get install libcurl4-gnutls-dev
$ curl-config --libs
-L/usr/lib/i386-linux-gnu -lcurl -Wl,-Bsymbolic-functions

#oook. lets do it:
$ gcc -o curltest_easy curltest_easy.c -L/usr/lib/i386-linux-gnu -lcurl -Wl,-Bsymbolic-functions
$ ./curltest_easy
*/



curltest_multi.c:

/*----------------------------------------------------
curltest_multi.c
WARNING: for test purposes only ~
*/
#include <stdio.h>
#include <unistd.h>
#include <sys/select.h> /* select() */
#include <curl/curl.h> /* curl.h pulls in easy.h; curl/types.h is obsolete */
#include <sys/stat.h>

typedef struct S_dl_byte_data
{
    double new_bytes_received;  //from the latest request
    double existing_filesize;
} dl_byte_data, *pdl_byte_data;

static int dl_progress(pdl_byte_data pdata,double dltotal,double dlnow,double ultotal,double ulnow)
{
    /*dltotal := hacky way of getting the Content-Length ~ less hacky would be to first
    do a HEAD request & then curl_easy_getinfo with CURLINFO_CONTENT_LENGTH_DOWNLOAD*/
    if (dltotal && dlnow)
    {
        pdata->new_bytes_received=dlnow;
        dltotal+=pdata->existing_filesize;
        dlnow+=pdata->existing_filesize;
        printf(" dl:%3.0f%% total:%.0f received:%.0f\r",100*dlnow/dltotal, dltotal, dlnow); //shenzi prog-mon
        fflush(stdout);
    }
    return 0;
}

static size_t dl_write(void *buffer, size_t size, size_t nmemb, void *stream)
{
    return fwrite(buffer, size, nmemb, (FILE*)stream);
}

////////////////////////
int do_dl(void)
{
    CURLM *multi_handle;
    CURL *curl;
    FILE *fp;
    CURLcode curl_retval;
    int retval=0;
    int handle_count=0;
    double dl_bytes_remaining, dl_bytes_received;
    dl_byte_data st_dldata={0};
    char curl_error_buf[CURL_ERROR_SIZE]={"meh"};
    long dl_lowspeed_bytes=1000, dl_lowspeed_time=10; /* 1KBs for 10 secs*/

    /*put something biG here, preferably on a server that you can switch off at will ;) */
    char url[] = {"http://fc00.deviantart.net/fs26/f/2008/134/1/a/Dragon_VII_by_NegativeFeedback.swf"};

    char outfilename[]={"blah.swf"}, filename[]={"blah.dl"};
    struct stat st={0};


    if (!(fp=fopen(filename, "ab")) || -1==fstat(fileno(fp), &st)) //append binary
      return -1;

    if (curl_global_init(CURL_GLOBAL_DEFAULT))
      return -2;

    if (!(multi_handle = curl_multi_init()))
      return -3;

    if (!(curl = curl_easy_init()))
      return -4;


    st_dldata.new_bytes_received=st_dldata.existing_filesize=st.st_size;

    //http://curl.haxx.se/libcurl/c/curl_easy_setopt.html
    curl_easy_setopt(curl, CURLOPT_URL, url);

    /*callbacks*/
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, dl_write);
    curl_easy_setopt(curl, CURLOPT_PROGRESSFUNCTION, dl_progress);
    curl_easy_setopt(curl, CURLOPT_PROGRESSDATA, &st_dldata);
    curl_easy_setopt(curl, CURLOPT_NOPROGRESS, 0);

    /*curl will keep running -so you have the freedom to recover from network disconnects etc
    in your own way without disturbing the curl task in hand. ** this is by design :p **
    The following sets up min download speed threshold & time endured before aborting*/
    curl_easy_setopt(curl, CURLOPT_LOW_SPEED_LIMIT, dl_lowspeed_bytes); //bytes/sec
    curl_easy_setopt(curl, CURLOPT_LOW_SPEED_TIME, dl_lowspeed_time); //seconds while below low speed limit before aborting
    //alternatively, from libcurl 7.25.0 onwards, these are available:
    //curl_easy_setopt(curl, CURLOPT_TCP_KEEPALIVE,1L);
    //curl_easy_setopt(curl, CURLOPT_TCP_KEEPIDLE,10);
    //curl_easy_setopt(curl, CURLOPT_TCP_KEEPINTVL,10);

    curl_easy_setopt(curl, CURLOPT_WRITEDATA, fp);

    /*uncomment this to get curl to tell you what its up to*/
    //curl_easy_setopt(curl, CURLOPT_VERBOSE, 1L);

    curl_easy_setopt(curl, CURLOPT_ERRORBUFFER, curl_error_buf);


    do
    {
        if (st_dldata.new_bytes_received) //set the new range for the partial transfer if we have previously received some bytes 
        {
            printf("resuming d/l..\n");
            fflush(fp);
            //get the new filesize & sanity check for file; on error quit outer do-loop & return to main
            if (-1==(retval=fstat(fileno(fp), &st)) || !(st_dldata.existing_filesize=st.st_size)) break; 
            //see also: CURLOPT_RANGE for passing a string with our own X-Y range
            curl_easy_setopt(curl, CURLOPT_RESUME_FROM, st.st_size);
            st_dldata.new_bytes_received=0;
        }
        printf("\n\nbytes already received:[%.0f]\n", st_dldata.existing_filesize);

        //re-use the curl handle again & again & again & again... lol
        curl_multi_add_handle(multi_handle, curl);

        do //curl_multi_perform event-loop
        {
            CURLMsg *pMsg;
            int msgs_in_queue;

            while (CURLM_CALL_MULTI_PERFORM == curl_multi_perform(multi_handle, &handle_count));

            //check for any messages regardless of handle count
            while ((pMsg = curl_multi_info_read(multi_handle, &msgs_in_queue)))
            {
                long http_response;

                printf("\nmsgs_in_queue:[%d]\n",msgs_in_queue);
                if (CURLMSG_DONE != pMsg->msg)
                {
                    fprintf(stderr,"CURLMSG_DONE != pMsg->msg:[%d]\n", pMsg->msg);
                }
                else
                {
                    printf("pMsg->data.result:[%d] meaning:[%s]\n",pMsg->data.result,curl_easy_strerror(pMsg->data.result));
                    if (CURLE_OK != pMsg->data.result) printf("curl_error_buf:[%s]\n", curl_error_buf);
                    switch(pMsg->data.result)
                    {
                    case CURLE_OK: ///////////////////////////////////////////////////////////////////////////////////////
                    printf("CURLE_OK: ");
                    curl_easy_getinfo(pMsg->easy_handle, CURLINFO_CONTENT_LENGTH_DOWNLOAD, &dl_bytes_remaining);
                    curl_easy_getinfo(pMsg->easy_handle, CURLINFO_SIZE_DOWNLOAD, &dl_bytes_received);
                    if (dl_bytes_remaining == dl_bytes_received)
                    {
                        printf("our work here is done ;)\n");
                        rename(filename, outfilename);
                        retval=1;
                    }
                    else
                    {
                        printf("ouch! st_dldata.new_bytes_received[%f]\n",st_dldata.new_bytes_received);
                        printf("ouch! dl_bytes_received[%f] dl_bytes_remaining[%f]\n",dl_bytes_received,dl_bytes_remaining);
                        retval=dl_bytes_received < dl_bytes_remaining ? 0 : -5;
                    }
                    break; /////////////////////////////////////////////////////////////////////////////////////////////////

                    case CURLE_COULDNT_CONNECT:      //no network connectivity ?
                    case CURLE_OPERATION_TIMEDOUT:   //cos of CURLOPT_LOW_SPEED_TIME
                    case CURLE_COULDNT_RESOLVE_HOST: //host/DNS down ?
                    printf("CURMESSAGE switch handle_count:[%d]\n",handle_count);
                    break; //we'll keep trying

                    default://see: http://curl.haxx.se/libcurl/c/libcurl-errors.html
                    handle_count=0;
                    retval=-5;
                    };


                    //see: http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
                    curl_retval=curl_easy_getinfo(pMsg->easy_handle, CURLINFO_RESPONSE_CODE, &http_response);
                    printf("CURLINFO_RESPONSE_CODE HTTP:[%ld]\n", http_response);
                    switch(http_response)
                    {
                    case 0:   //eg connection down  from kick-off ~suggest retrying till some max limit
                    case 200: //yay we at least got to our url
                    case 206: //Partial Content
                    break;

                    case 416:
                    //cannot d/l range ~ either cos no server support
                    //or cos we're asking for an invalid range ~ie: we already d/ld the file
                    printf("HTTP416: either the d/l is already complete or the http server cannot d/l a range\n");
                    retval=2;
                    break;

                    default: //suggest quitting on an unhandled error
                    handle_count=0;
                    retval=-6;
                    };
                }
            }

            if (handle_count) //select on any active handles
            {
                fd_set fd_read, fd_write, fd_excep;
                struct timeval timeout={5,0};
                int select_retval;
                int fd_max;

                FD_ZERO(&fd_read); FD_ZERO(&fd_write); FD_ZERO(&fd_excep);
                curl_multi_fdset(multi_handle, &fd_read, &fd_write, &fd_excep, &fd_max);
                if (-1 == (select_retval=select(fd_max+1, &fd_read, &fd_write, &fd_excep, &timeout)))
                {
                    //errno shall be set to indicate the error
                    fprintf(stderr, "yikes! select error :(\n");
                    handle_count=0;
                    retval=-7;
                    break;
                }
                else{/*check whatever*/}
            }

        } while (handle_count);

        curl_multi_remove_handle(multi_handle,curl);
        printf("continue from here?");
        getchar();        
    }
    while(retval==0);

    curl_multi_cleanup(multi_handle);
    curl_easy_cleanup(curl);
    curl_global_cleanup();
    if (fp) fclose(fp);

    return retval;
}

////////////////////////
int main(void)
{
    int retval;
    printf("\n\ncurl_multi d/l test ~curl version:[%s]\n", curl_version());
    while (1!=(retval=do_dl()))
    {
        printf("retval [%d] continue?\n\n", retval);
        getchar();
    }
    printf("\nend of test!\n\n");
    return retval;
}

/* notes ----

$sudo apt-get install libcurl4-gnutls-dev
$curl-config --libs
-L/usr/lib/i386-linux-gnu -lcurl -Wl,-Bsymbolic-functions

#oook. lets do it:
$gcc -o curltest_multi curltest_multi.c -L/usr/lib/i386-linux-gnu -lcurl -Wl,-Bsymbolic-functions
$./curltest_multi

*/

Erm, you might want to remember to delete the blah.dl file before starting a brand new test. The prog deliberately does not, so you can truncate an existing file beforehand for testing ;)

NB: for something like this you maybe should not rely on just CURLE_COULDNT_CONNECT ~ your code should be mostly error handling lol (possibly less if your prog is strictly for personal use ;)


[Edit:] I have updated curltest_multi.c to demonstrate easy_handle re-use.

And do note the following quote from the documentation:

When a single transfer is completed, the easy handle is still left added to the multi stack. You need to first remove the easy handle with curl_multi_remove_handle(3) and then close it with curl_easy_cleanup(3), or possibly set new options to it and add it again with curl_multi_add_handle(3) to start another transfer.

hope this helps ;)