AWAIT multiple file downloads with DownloadDataAsy

Posted 2019-09-22 03:21

Question:

I have a zip file creator that takes a String[] of URLs and returns a zip file containing all of the files in the String[].

I figured there would be a number of examples of this, but I cannot seem to find an answer to "How to download many files asynchronously and return when done".

How do I download {n} files at once, and return the Dictionary only when all downloads are complete?

private static Dictionary<string, byte[]> ReturnedFileData(IEnumerable<string> urlList)
{
    var returnList = new Dictionary<string, byte[]>();
    using (var client = new WebClient())
    {
        foreach (var url in urlList)
        {
            client.DownloadDataCompleted += (sender1, e1) => returnList.Add(GetFileNameFromUrlString(url), e1.Result);
            client.DownloadDataAsync(new Uri(url));
        }
    }
    return returnList;
}

private static string GetFileNameFromUrlString(string url)
{
    var uri = new Uri(url);
    return System.IO.Path.GetFileName(uri.LocalPath);
}

Answer 1:

  • First, you tagged your question with async-await without actually using it. There really is no reason anymore to use the old asynchronous paradigms.
  • To wait asynchronously for all concurrent async operations to complete, you should use Task.WhenAll, which means that you need to keep all the tasks in some construct (e.g. a dictionary) before actually extracting their results.
  • At the end, when you have all the results in hand, you just create the new result dictionary by parsing each URI into a file name and extracting the result from the corresponding task.

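async Task<Dictionary<string, byte[]>> ReturnFileData(IEnumerable<string> urls)
{
    // Start every download immediately; the dictionary maps each URI to its running task.
    var dictionary = urls.ToDictionary(
        url => new Uri(url),
        url => new WebClient().DownloadDataTaskAsync(url));

    // Asynchronously wait until all downloads have completed.
    await Task.WhenAll(dictionary.Values);

    // Every task is complete here, so reading Result does not block.
    return dictionary.ToDictionary(
        pair => Path.GetFileName(pair.Key.LocalPath),
        pair => pair.Value.Result);
}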
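
The same Task.WhenAll pattern can be tried without any network access. The following is only a minimal sketch: WhenAllSketch, DownloadAllAsync and FakeDownloadAsync are hypothetical names that are not part of the answer above, and the fake downloader simply returns a byte array as long as the URL, standing in for a real call such as DownloadDataTaskAsync.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAllSketch
{
    // Hypothetical stand-in for a real download: waits briefly, then
    // returns a byte array whose length equals the length of the URL.
    public static async Task<byte[]> FakeDownloadAsync(string url)
    {
        await Task.Delay(10);
        return new byte[url.Length];
    }

    public static async Task<Dictionary<string, byte[]>> DownloadAllAsync(
        IEnumerable<string> urls, Func<string, Task<byte[]>> download)
    {
        // ToDictionary is eager, so every download is started here,
        // before any of them is awaited.
        var pending = urls.ToDictionary(
            url => Path.GetFileName(new Uri(url).LocalPath),
            url => download(url));

        // Completes once all downloads have completed; rethrows if one failed.
        await Task.WhenAll(pending.Values);

        // All tasks are finished, so Result returns immediately.
        return pending.ToDictionary(pair => pair.Key, pair => pair.Value.Result);
    }
}
```

Called as `await WhenAllSketch.DownloadAllAsync(urls, WhenAllSketch.FakeDownloadAsync)`, this returns only after every download has finished. Note that two URLs with the same file name would make ToDictionary throw, so a real implementation may need a different key.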


Answer 2:

    public string JUST_return_dataURL_by_URL(string URL, int interval, int max_interval)
    {
        // proxy and _headers are assumed to be fields of the enclosing class (not shown here).
        var client = new WebClient { Proxy = proxy, Headers = _headers };
        string downloaded_from_URL = "false";       //default - until downloading
        client.DownloadDataCompleted += (sender, e) =>
        {
            if (e.Cancelled || e.Error != null) return;   // keep the "false" default on failure
            Console.WriteLine("Done!");
            string dataURL = "data:image/png;base64," + Convert.ToBase64String(e.Result);
            string filename = Guid.NewGuid().ToString().Trim('{', '}')+".png";
            downloaded_from_URL =
                        "Image Downloaded from " + URL
                    +   "<br>"
                    +   "<a href=\""+dataURL+"\" download=\""+filename+"\">"
                    +       "<img src=\"" + dataURL + "\"/>"+filename
                    +   "</a>"
            ;
        };
        client.DownloadDataAsync(new System.Uri(URL));

        // Poll until the handler has stored a result or max_interval milliseconds have passed.
        int i = 0;
        do{
            Thread.Sleep(interval);
            i+=interval;
        }
        while( (downloaded_from_URL == "false") && (i < max_interval) );

        return downloaded_from_URL;
    }


Answer 3:

You'd want the Task.WaitAll method (see the MSDN documentation).

Create each download as a separate task, then pass them as a collection.

A shortcut to this might be to wrap your download method in a task:

return Task.Run(() => { /* method body */ });

Apologies for vagueness, working on iPad sucks for coding.
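
As a rough illustration of "one task per download, then wait on the collection", here is a minimal sketch. WaitAllSketch, DownloadAllBlocking and FakeDownload are hypothetical names, and FakeDownload only simulates work instead of calling a real download method. Unlike Task.WhenAll, Task.WaitAll blocks the calling thread until every task has finished.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class WaitAllSketch
{
    // Hypothetical stand-in for the body of a real download method.
    public static byte[] FakeDownload(string url)
    {
        Thread.Sleep(10);
        return new byte[url.Length];
    }

    public static byte[][] DownloadAllBlocking(string[] urls)
    {
        // Task.Run returns tasks that are already started on the thread pool.
        Task<byte[]>[] tasks = urls
            .Select(url => Task.Run(() => FakeDownload(url)))
            .ToArray();

        // Blocks the calling thread until every task in the array has finished.
        Task.WaitAll(tasks);

        // Results come back in the same order as the input URLs.
        return tasks.Select(t => t.Result).ToArray();
    }
}
```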

EDIT:

Another implementation of this that may be worth considering is wrapping the downloads using the parallel framework.

Since your tasks all do the same thing and take a single parameter, you could instead use Parallel.ForEach and wrap that in a single task:

public System.Threading.Tasks.Task<System.Collections.Generic.IDictionary<string, byte[]>> DownloadTask(System.Collections.Generic.IEnumerable<string> urlList)
{
    // Task.Run returns a started task; a task built with "new Task(...)" never runs until Start() is called.
    return System.Threading.Tasks.Task.Run<System.Collections.Generic.IDictionary<string, byte[]>>(() =>
    {
        // Thread-safe collection, because Parallel.ForEach adds entries from several threads.
        var r = new System.Collections.Concurrent.ConcurrentDictionary<string, byte[]>();
        System.Threading.Tasks.Parallel.ForEach(urlList, url =>
        {
            using (System.Net.WebClient client = new System.Net.WebClient())
            {
                var bytedata = client.DownloadData(url);
                r.TryAdd(url, bytedata);
            }
        });

        // Copy into a plain dictionary once all downloads have finished.
        var results = new System.Collections.Generic.Dictionary<string, byte[]>();
        foreach (var value in r)
        {
            results.Add(value.Key, value.Value);
        }

        return results;
    });
}

This leverages a concurrent collection to support parallel access within the method before converting back to IDictionary.

This method returns a task, so it can be called with await.
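
One detail matters when a method hands back a task to be awaited: a task created with the Task constructor is not scheduled until Start() is called, while Task.Run returns a task that is already running. The sketch below only illustrates that difference; ColdTaskSketch and its members are hypothetical names, not part of the answer.

```csharp
using System.Threading.Tasks;

public static class ColdTaskSketch
{
    // Created with the constructor: stays in the Created state until Start() is called.
    public static Task<int> MakeCold() => new Task<int>(() => 42);

    // Created with Task.Run: already queued to the thread pool.
    public static Task<int> MakeHot() => Task.Run(() => 42);

    public static async Task<int> AwaitBothAsync()
    {
        Task<int> cold = MakeCold();
        cold.Start();               // without this call, the await below would never complete
        int a = await cold;
        int b = await MakeHot();
        return a + b;
    }
}
```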

Hope this provides a helpful alternative.