I was wondering whether it's true that async-await should not be used for "high-CPU" tasks. I saw this claimed in a presentation.
So I guess that would mean something like
Task<int> calculateMillionthPrimeNumber = CalculateMillionthPrimeNumberAsync();
DoIndependentWork();
int p = await calculateMillionthPrimeNumber;
My question is could the above be justified, or if not, is there some other example of making a high-CPU task async?
Let's say your CalculateMillionthPrimeNumber was something like the following (not very efficient or ideal in its use of goto, but very simple to understand):
Now, there's no useful point here at which this can do something asynchronously. Let's make it a Task-returning method using async:
The compiler will warn us about that, because there's nowhere for us to await anything useful. Really, calling this is going to be the same as a slightly more complicated version of calling Task.FromResult(CalculateMillionthPrimeNumber()). That is to say, it's the same as doing the calculation and then creating an already-completed task that has the calculated number as its result.
Now, already-completed tasks aren't always pointless. For example, consider:
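The listing for this example is not shown here. As a rough sketch of the kind of method being described, assuming a hypothetical CachedTextReader class with an in-memory cache in front of a file read (the class, method, and field names are all made up for illustration):

```csharp
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

public sealed class CachedTextReader
{
    // Hypothetical in-memory cache keyed by file path (an assumption for this sketch).
    private readonly ConcurrentDictionary<string, string> _cache =
        new ConcurrentDictionary<string, string>();

    public async Task<string> GetTextAsync(string path)
    {
        // Cache hit: no await is reached, so the method runs to completion
        // synchronously and the returned task is already completed.
        if (_cache.TryGetValue(path, out string cached))
            return cached;

        // Cache miss: genuinely asynchronous I/O, so the caller typically
        // gets back a task that has not completed yet.
        using (var reader = new StreamReader(path))
        {
            string text = await reader.ReadToEndAsync();
            _cache[path] = text;
            return text;
        }
    }
}
```

On a second call with the same path, the returned task should have IsCompleted equal to true immediately, which is the "already-completed but not pointless" case.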
This returns an already-completed task when the string is in the cache, and not otherwise, and in that case it will return pretty fast. Other cases are if there is more than one implementation of the same interface and not all implementations can use async I/O.
And likewise an async method that awaits this method will return an already-completed task or not depending on this. It's actually a pretty great way of just staying on the same thread and doing what needs to be done when that is possible.
But if it's always possible then the only effect is an extra bit of bloat around creating the Task object and the state machine that async uses to implement it.
So, pretty pointless. If that was how the version in your question was implemented then calculateMillionthPrimeNumber would have had IsCompleted returning true right from the beginning. You should have just called the non-async version.
Okay, as the implementers of CalculateMillionthPrimeNumberAsync() we want to do something more useful for our users. So we do:
Okay, now we're not wasting our user's time. DoIndependentWork() will do stuff at the same time as CalculateMillionthPrimeNumberAsync(), and if it finishes first then the await will release that thread.
Great!
Only, we haven't really moved the needle that much from the synchronous position. Indeed, especially if DoIndependentWork() isn't arduous we may have made it a lot worse. The synchronous way would do everything on one thread, let's call it Thread A. The new way does the calculation on Thread B, then releases Thread A, then synchronises back in one of a few possible ways. It's a lot of work; has it gained anything?
Well maybe, but the author of CalculateMillionthPrimeNumberAsync() can't know that, because the factors that influence that are all in the calling code. The calling code could have done StartNew itself, and been better able to fit the synchronisation options to the need when it did so.
So, while tasks can be a convenient way of calling CPU-bound code in parallel to another task, methods that do so are not useful. Worse, they're deceiving, as someone seeing CalculateMillionthPrimeNumberAsync could be forgiven for believing that calling it wasn't pointless.
There are, in fact, two major uses of async/await. One (and my understanding is that this is one of the primary reasons that it was put into the framework) is to enable the calling thread to do other work while it's waiting for a result. This is mostly for I/O-bound tasks (i.e. tasks where the main "holdup" is some kind of I/O - waiting for a hard drive, server, printer, etc. to respond or complete its task).
As a side note, if you're using async/await in this way, it's important to make sure that you've implemented it in such a way that the calling thread can actually do other work while it's waiting for the result; I've seen plenty of cases where people do stuff like "A waits for B, which waits for C"; this can end up performing no better than if A just called B synchronously and B just called C synchronously (because the calling thread's never allowed to do other work while it's waiting for the results of B and C).
In the case of I/O-bound tasks, there's little point in creating an extra thread just to wait for a result. My usual analogy here is to think of ordering in a restaurant with 10 people in a group. If the first person the waiter asks to order isn't ready yet, the waiter doesn't just wait for him to be ready before he takes anyone else's order, nor does he bring in a second waiter just to wait for the first guy. The best thing to do in this case is to ask the other 9 people in the group for their orders; hopefully, by the time that they've ordered, the first guy will be ready. If not, at least the waiter's still saved some time because he spends less time being idle.
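To make the waiter analogy concrete, here is a small sketch that contrasts awaiting each operation in turn with starting them all and awaiting them together. Everything in it is hypothetical: Task.Delay stands in for a real I/O call, and the Waiter class and its method names are invented for illustration.

```csharp
using System.Threading.Tasks;

public static class Waiter
{
    // Stand-in for an I/O-bound call; Task.Delay does not occupy a thread
    // while it is pending.
    public static async Task<string> TakeOrderAsync(int guest, int thinkingMs)
    {
        await Task.Delay(thinkingMs);
        return "order " + guest;
    }

    // Waits for each guest before moving on: total time is roughly the
    // sum of all the delays.
    public static async Task<string[]> TakeOrdersOneByOneAsync(int guests, int thinkingMs)
    {
        var orders = new string[guests];
        for (int i = 0; i < guests; i++)
            orders[i] = await TakeOrderAsync(i, thinkingMs);
        return orders;
    }

    // Starts every operation first, then awaits them together: total time
    // is roughly the longest single delay.
    public static Task<string[]> TakeOrdersTogetherAsync(int guests, int thinkingMs)
    {
        var pending = new Task<string>[guests];
        for (int i = 0; i < guests; i++)
            pending[i] = TakeOrderAsync(i, thinkingMs);
        return Task.WhenAll(pending);
    }
}
```

Both methods return the same orders in the same sequence; only the second lets the waits overlap, and neither needs an extra thread just to wait.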
It's also possible to use things like Task.Run to do CPU-bound tasks (and this is the second use for this). To follow our analogy above, this is a case where it would be generally useful to have more waiters - e.g. if there were too many tables for a single waiter to service. Really, all that this actually does "behind the scenes" is use the Thread Pool; it's one of several possible constructs to do CPU-bound work (e.g. just putting it "directly" on the Thread Pool, explicitly creating a new thread, or using a Background Worker) so it's a design question which mechanism you end up using.
One advantage of async/await here is that it can (given the right circumstances) reduce the amount of explicit locking/synchronization logic you have to write manually. Here's a kind of dumb example:
Obviously, I'm assuming here that the tasks are completely parallelizable. Note, too, that you could have used the Thread Pool yourself here, but that would be a little less convenient because you'd need some way to figure out yourself whether all of them had completed (rather than just letting the framework figure that out for you). You may also have been able to use a Parallel.For loop here.
Unless CalculateMillionthPrimeNumberAsync constantly uses async/await by itself, there is no reason not to let the Task run heavy CPU work, since it just delegates your method onto a ThreadPool thread. What a ThreadPool thread is and how it differs from a regular thread is written here.
In short, it just takes the thread pool thread into custody for quite a time (and the number of thread pool threads is limited), so, unless you are taking too many of them, there is nothing to worry about.
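As a small illustration of the two points above (that Task.Run work lands on a thread-pool thread, and that the framework can track completion of several CPU-bound tasks for you), here is a hedged sketch. PoolDemo, SumTo, and the other names are hypothetical stand-ins, and SumTo is only a placeholder for real CPU-bound work.

```csharp
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class PoolDemo
{
    // Hypothetical CPU-bound stand-in: sums the integers 1..n.
    public static long SumTo(int n)
    {
        long total = 0;
        for (int i = 1; i <= n; i++) total += i;
        return total;
    }

    // Reports whether the delegate passed to Task.Run executed on a
    // thread-pool thread.
    public static Task<bool> RanOnPoolThreadAsync() =>
        Task.Run(() => Thread.CurrentThread.IsThreadPoolThread);

    // Queues one pool work item per input and lets Task.WhenAll track
    // completion, so no manual counters or locks are needed.
    public static Task<long[]> SumManyAsync(params int[] inputs) =>
        Task.WhenAll(inputs.Select(n => Task.Run(() => SumTo(n))));
}
```

Each call to Task.Run occupies one pool thread for as long as the delegate runs, which is why queueing a very large number of long-running items at once is the case to watch for.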
Yes, that's true.
I would say that it is not justified. In the general case, you should avoid using Task.Run to implement methods with asynchronous signatures. Don't expose asynchronous wrappers for synchronous methods. This is to prevent confusion by consumers, particularly on ASP.NET.
However, there is nothing wrong with using Task.Run to call a synchronous method, e.g., in a UI app. In this way, you can use multithreading (Task.Run) to keep the UI thread free, and consume it elegantly with await:
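The original snippet is not shown here. A minimal sketch of the pattern, with no real UI framework involved: PrimeViewModel, Status, and OnCalculateClickedAsync are hypothetical names, and a 1000th-prime calculation stands in for the heavier synchronous method so the example finishes quickly.

```csharp
using System.Threading.Tasks;

public class PrimeViewModel
{
    public string Status { get; private set; } = "idle";

    // Hypothetical stand-in for the synchronous, CPU-bound method.
    private static int CalculateThousandthPrime()
    {
        int count = 0, candidate = 1;
        while (count < 1000)
        {
            candidate++;
            bool isPrime = true;
            for (int d = 2; d * d <= candidate; d++)
                if (candidate % d == 0) { isPrime = false; break; }
            if (isPrime) count++;
        }
        return candidate;
    }

    // Intended to be called from a UI event handler such as a button click.
    public async Task OnCalculateClickedAsync()
    {
        Status = "working";
        // The caller, not the library, decides to offload the synchronous
        // method to the thread pool; the calling thread is free meanwhile.
        int p = await Task.Run(() => CalculateThousandthPrime());
        Status = "result: " + p;
    }
}
```

In a real UI app the code after the await would normally resume on the UI thread, so updating controls there is safe. The synchronous method keeps its honest synchronous signature.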