Httpclient tpl parallel multiple request fastest w

Posted 2019-07-21 01:57

Question:

I want to download the page content for a list of URLs (10,000 URLs).

  1. Is HttpClient the fastest and cleanest way to do this (compared with HttpWebRequest or WebClient)?
  2. If I want it to be fast, is the TPL the best approach?

I'm looking for something like the following, but really fast and clean (10,000 requests):

public List<string> GetContentListOfUrlList(List<Uri> uriList, int maxSimultaneousRequest)
    {
        //requesting url by the fastest way

    }

I hope it's better like this ;)

EDIT 2: Based on noseratio's other post, is this the best solution?

public async Task<List<string>> DownloadAsync(List<Uri> urls, int maxDownloads)
    {
        var concurrentQueue = new ConcurrentQueue<string>();

        using (var semaphore = new SemaphoreSlim(maxDownloads))
        using (var httpClient = new HttpClient())
        {
            var tasks = urls.Select(async (url) =>
            {
                await semaphore.WaitAsync();
                try
                {
                    var data = await httpClient.GetStringAsync(url);
                    concurrentQueue.Enqueue(data);
                }
                finally
                {
                    semaphore.Release();
                }
            });

            await Task.WhenAll(tasks.ToArray());
        }
        return concurrentQueue.ToList();
    }

Questions

  1. ConfigureAwait? Should I use

    var data = await httpClient.GetStringAsync(url).ConfigureAwait(false);

     or

    var data = await httpClient.GetStringAsync(url);

  2. ServicePointManager.DefaultConnectionLimit? Should I change this property as well?

Answer 1:

There is a ParallelOptions.MaxDegreeOfParallelism Property which specifies the maximum number of concurrent operations:

Parallel.ForEach(list, 
        new ParallelOptions { MaxDegreeOfParallelism = 4 }, 
        DownloadPage);

Reference: MaxDegreeOfParallelism
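
To make the Parallel.ForEach call above concrete, here is a minimal sketch. The class name ParallelDownloadSketch, the DownloadAll wrapper and the MaxObserved counter are made up for illustration. DownloadPage is a hypothetical stand-in that sleeps instead of issuing an HTTP request, so the sketch runs offline; a real version would presumably call into HttpClient or WebClient there.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelDownloadSketch
{
    private static int _active;      // DownloadPage calls currently running
    private static int _maxObserved; // highest value _active has reached

    // Exposed only so the concurrency cap can be checked after a run.
    public static int MaxObserved => _maxObserved;

    // Hypothetical stand-in for the answer's DownloadPage (assumption:
    // it takes a Uri and returns the page content as a string).
    private static string DownloadPage(Uri uri)
    {
        int now = Interlocked.Increment(ref _active);

        // Record the high-water mark of concurrent calls, lock-free.
        int seen;
        while (now > (seen = Volatile.Read(ref _maxObserved)))
        {
            Interlocked.CompareExchange(ref _maxObserved, now, seen);
        }

        try
        {
            Thread.Sleep(20); // simulated network latency
            return "content of " + uri;
        }
        finally
        {
            Interlocked.Decrement(ref _active);
        }
    }

    // Runs DownloadPage over all URIs with at most maxParallel concurrent calls.
    // Results come back in completion order, not input order.
    public static List<string> DownloadAll(IEnumerable<Uri> uris, int maxParallel)
    {
        var results = new ConcurrentBag<string>();

        Parallel.ForEach(
            uris,
            new ParallelOptions { MaxDegreeOfParallelism = maxParallel },
            uri => results.Add(DownloadPage(uri)));

        return results.ToList();
    }
}
```

MaxDegreeOfParallelism is an upper bound, so fewer than maxParallel calls may be in flight at any moment. Note that Parallel.ForEach keeps a thread blocked for each item in progress. The SemaphoreSlim version in the question throttles asynchronous requests instead, which is a different technique from the one sketched here.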