How to speed up tasks with HttpClient

Posted 2019-07-15 05:23

Question:

I have a process where I need to make ~100 HTTP API calls to a server and process the results. I've put together this CommandExecutor, which builds a list of commands and then runs them asynchronously. Making about 100 calls and parsing the results takes over 1 minute, while a single request from a browser gets a response in ~100 ms. Even run one after another, ~100 calls should take around 10 seconds, so I believe I am doing something wrong and that this should go much faster.

 public static class CommandExecutor
 {
    private static readonly ThreadLocal<List<Command>> CommandsToExecute =
        new ThreadLocal<List<Command>>(() => new List<Command>());
    private static readonly ThreadLocal<List<Task<List<Candidate>>>> Tasks =
        new ThreadLocal<List<Task<List<Candidate>>>>(() => new List<Task<List<Candidate>>>());

    public static void ExecuteLater(Command command)
    {
        CommandsToExecute.Value.Add(command);
    }

    public static void StartExecuting()
    {
        foreach (var command in CommandsToExecute.Value)
        {
            Tasks.Value.Add(Task.Factory.StartNew<List<Candidate>>(command.GetResult));
        }

        Task.WaitAll(Tasks.Value.ToArray());
    }

    public static List<Candidate> Result()
    {
        return Tasks.Value.Where(x => x.Result != null)
                          .SelectMany(x => x.Result)
                          .ToList();
    }
}

The Command that I pass into this list creates a new HttpClient, calls GetAsync on that client with a URL, converts the response to an object, and then hydrates a field.

    protected void Initialize()
    {
        _httpClient = new HttpClient();
        _httpClient.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("text/plain"));
    }

    protected override void Execute()
    {
        Initialize();

        var task = _httpClient.GetAsync(string.Format(Url, Input));
        Result = ConvertResponseToObjectAsync(task).Result;
        Result.ForEach(x => x.prop = value);
    }

    private static Task<Model> ConvertResponseToObjectAsync(Task<HttpResponseMessage> task)
    {
        return task.Result.Content.ReadAsAsync<Model>(
           new MediaTypeFormatter[]
           {
                 new Formatter()
           });
    }

Can you spot my bottleneck, or do you have any suggestions on how to speed this up?

EDIT: Making these changes brought it down to 4 seconds.

    protected override void Execute()
    {
        Initialize();

        _httpClient.GetAsync(string.Format(Url, Input))
        .ContinueWith(httpResponse => ConvertResponseToObjectAsync(httpResponse)
        .ContinueWith(ProcessResult));
    }

    protected void ProcessResult(Task<Model> model)
    {
        Result = model.Result;
        Result.ForEach(x => x.prop = value);
    }

Answer 1:

Avoid the use of task.Result in ConvertResponseToObjectAsync and then again in Execute. Instead chain these on to the original GetAsync task with ContinueWith.

As the code stands, Result blocks the current thread until the other task finishes, so the thread pool quickly fills up with tasks that are waiting on other tasks that have nowhere to run. The thread pool does eventually add another thread (after waiting about a second), so the work finishes in the end, but it's hardly efficient.

As a general principle, you should avoid ever accessing Task.Result except in a task continuation.
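
The chaining described above can be sketched roughly as follows. This is only an illustration under assumptions: FetchAsync, StubHandler and the example.invalid URLs are hypothetical names, the stub handler stands in for the real server so the sample runs offline, and the body is read as a plain string rather than through the question's Formatter. The only Result access sits inside a continuation, where the antecedent task has already completed.

```csharp
using System;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the real server: answers every request after ~100 ms.
public class StubHandler : HttpMessageHandler
{
    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        await Task.Delay(100, cancellationToken);
        return new HttpResponseMessage(HttpStatusCode.OK)
        {
            Content = new StringContent("result for " + request.RequestUri)
        };
    }
}

public static class Program
{
    // Chains the body read onto GetAsync instead of blocking on it.
    public static Task<string> FetchAsync(HttpClient client, string url)
    {
        return client.GetAsync(url)
            // The antecedent has finished by the time this runs, so Result does not block here.
            .ContinueWith(t => t.Result.Content.ReadAsStringAsync())
            // ContinueWith yields Task<Task<string>>; Unwrap flattens it to Task<string>.
            .Unwrap();
    }

    // async Main needs C# 7.1 or later.
    public static async Task Main()
    {
        using (var client = new HttpClient(new StubHandler()))
        {
            // Start every request first, then wait for all of them together.
            Task<string>[] pending = Enumerable.Range(0, 100)
                .Select(i => FetchAsync(client, "http://example.invalid/api/" + i))
                .ToArray();

            string[] results = await Task.WhenAll(pending);
            Console.WriteLine(results.Length);
        }
    }
}
```

With async/await available, the body of FetchAsync could instead await GetAsync and then await ReadAsStringAsync; the compiler generates equivalent continuations.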

As a bonus, you probably don't want to be using ThreadLocal<T>. ThreadLocal<T> keeps a separate instance of the stored item for each thread that accesses it. In this case, it looks like you want storage that is shared between threads but still thread-safe. I would recommend ConcurrentQueue<T> for this sort of thing.
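
A small sketch of what shared, thread-safe storage with ConcurrentQueue could look like. The string commands and the Commands field are placeholders assumed for illustration; they are not the question's Command type.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class Program
{
    // One queue shared by every thread, unlike ThreadLocal<T>,
    // which hands each thread its own separate list.
    public static readonly ConcurrentQueue<string> Commands =
        new ConcurrentQueue<string>();

    public static void Main()
    {
        // Enqueue from many threads at once; no explicit lock is needed.
        Parallel.For(0, 100, i => Commands.Enqueue("command-" + i));
        Console.WriteLine(Commands.Count);

        // Drain the queue from a single thread; TryDequeue returns false once it is empty.
        int drained = 0;
        while (Commands.TryDequeue(out string command))
        {
            drained++;
        }
        Console.WriteLine(drained);
    }
}
```

Because the queue is a single shared instance, commands added on one thread are visible to whichever thread later starts executing them, which is not the case with per-thread lists.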



Answer 2:

Stop creating new HttpClient instances. Every time you dispose an HttpClient instance, it closes the TCP/IP connection. Create one HttpClient instance and reuse it for every request. HttpClient can make multiple requests on multiple threads at the same time.
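
As a rough illustration of reusing one client, the sketch below sends many concurrent requests through a single static HttpClient. CountingHandler, GetAllAsync and the example.invalid URLs are hypothetical; the handler is a stub that only exists so the sample runs without a network, and real code would construct the client without it.

```csharp
using System;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stub: counts the requests it sees and answers each after ~50 ms.
public class CountingHandler : HttpMessageHandler
{
    private int _requests;
    public int Requests => _requests;

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        Interlocked.Increment(ref _requests);
        await Task.Delay(50, cancellationToken);
        return new HttpResponseMessage(HttpStatusCode.OK)
        {
            Content = new StringContent("ok")
        };
    }
}

public static class Program
{
    public static readonly CountingHandler Handler = new CountingHandler();

    // One client for the whole process, shared by all requests.
    public static readonly HttpClient Client = new HttpClient(Handler);

    public static async Task<string[]> GetAllAsync(int count)
    {
        // All requests go through the same client and run concurrently.
        var pending = Enumerable.Range(0, count)
            .Select(i => Client.GetStringAsync("http://example.invalid/items/" + i));
        return await Task.WhenAll(pending);
    }

    // async Main needs C# 7.1 or later.
    public static async Task Main()
    {
        string[] bodies = await GetAllAsync(100);
        Console.WriteLine(bodies.Length + " responses from " + Handler.Requests + " requests");
    }
}
```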