HttpClient - Send a batch of requests

2019-01-22 16:55发布

I want to iterate a batch of requests, sending each one of them to an external API using HttpClient class.

  foreach (var MyRequest in RequestsBatch)
  {
            try
            {
                HttpClient httpClient = new HttpClient();
                httpClient.Timeout = TimeSpan.FromMilliseconds(5);
                HttpResponseMessage response = await httpClient.PostAsJsonAsync<string>(string.Format("{0}api/GetResponse", endpoint), myRequest);
                JObject resultResponse = await response.Content.ReadAsAsync<JObject>();
            }
            catch (Exception ex)
            {
                continue;
            }
 }

The context here is I need to set a very small timeout value, so in case the response takes more than that time, we simply get the "Task was cancelled" exception and continue iterating.

Now, in the code above, comment these two lines:

                HttpResponseMessage response = await httpClient.PostAsJsonAsync<string>(string.Format("{0}api/GetResponse", endpoint), myRequest);
                resultResponse = await response.Content.ReadAsAsync<JObject>();

The iteration ends very fast. Uncomment them and try again. It takes a lot of time.

I wonder if calling PostAsJsonAsync/ReadAsAsync methods with await takes more time than the timeout value?

Based on the answer below, supposing it will create different threads, we have this method:

  public Task<JObject> GetResponse(string endPoint, JObject request, TimeSpan timeout)
    {
        return Task.Run(async () =>
        {
            try
            {
                HttpClient httpClient = new HttpClient();
                httpClient.Timeout = TimeSpan.FromMilliseconds(5);
                HttpResponseMessage response = await httpClient.PostAsJsonAsync<string>(string.Format("{0}api/GetResponse", endPoint), request).WithTimeout<HttpResponseMessage>(timeout);
                JObject resultResponse = await response.Content.ReadAsAsync<JObject>().WithTimeout<JObject>(timeout);
                return resultResponse;
            }
            catch (Exception ex)
            {
                return new JObject() { new JProperty("ControlledException", "Invalid response. ")};
            }
        });
    }

An exception is raised there and the JObject exception should be returned, very fast, however, if using httpClient methods, even if it raises the exception it takes a lot of time. Is there a behind the scenes processing affecting the Task even if the return value was a simple exception JObject?

If yes, which another approach could be used to send a batch of requests to an API in a very fast way?

2条回答
时光不老,我们不散
2楼-- · 2019-01-22 17:03

It doesn't look like you're actually running a seperate thread for each request. Try something like this:

var taskList = new List<Task<JObject>>();

foreach (var myRequest in RequestsBatch)
{
    taskList.Add(GetResponse(endPoint, myRequest));
}

try
{
    Task.WaitAll(taskList.ToArray());
}
catch (Exception ex)
{
}

public Task<JObject> GetResponse(string endPoint, string myRequest)
{
    return Task.Run(() =>
        {
            HttpClient httpClient = new HttpClient();

            HttpResponseMessage response = httpClient.PostAsJsonAsync<string>(
                 string.Format("{0}api/GetResponse", endpoint), 
                 myRequest, 
                 new CancellationTokenSource(TimeSpan.FromMilliseconds(5)).Token);

            JObject resultResponse = response.Content.ReadAsAsync<JObject>();
        });
}
查看更多
Lonely孤独者°
3楼-- · 2019-01-22 17:07

I agree with the accepted answer in that the key to speeding things up is to run the requests in parallel. But any solution that forces additional threads into the mix by use of Task.Run or Parallel.ForEach is not gaining you any efficiency with I/O bound asynchronous operations. If anything it's hurting.

You can easily get all calls running concurrently while letting the underlying async subsystems decide how many threads are required to complete the tasks as efficiently as possible. Chances are that number is much smaller than the number of concurrent calls, because they don't require any thread at all while they're awaiting a response.

Further, the accepted answer creates a new instance of HttpClient for each call. Don't do that either - bad things can happen.

Here's a modified version of the accepted answer:

var httpClient = new HttpClient {
    Timeout = TimeSpan.FromMilliseconds(5)
};

var taskList = new List<Task<JObject>>();

foreach (var myRequest in RequestsBatch)
{
    // by virtue of not awaiting each call, you've already acheived parallelism
    taskList.Add(GetResponseAsync(endPoint, myRequest));
}

try
{
    // asynchronously wait until all tasks are complete
    await Task.WhenAll(taskList.ToArray());
}
catch (Exception ex)
{
}

async Task<JObject> GetResponseAsync(string endPoint, string myRequest)
{
    // no Task.Run here!
    var response = await httpClient.PostAsJsonAsync<string>(
        string.Format("{0}api/GetResponse", endpoint), 
        myRequest);
    return await response.Content.ReadAsAsync<JObject>();
}
查看更多
登录 后发表回答