Why shouldn't I use F# asynchronous workflows for parallelism?

Posted 2020-05-17 08:39

I have been learning F# recently, and I am particularly interested in how easily it exploits data parallelism. The data |> Array.map |> Async.Parallel |> Async.RunSynchronously idiom is easy to understand, straightforward to use, and delivers real value.
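For readers who have not seen the idiom written out, here is a minimal sketch. The square function and the input array are made up for illustration; each element is wrapped in an async block before being handed to Async.Parallel.

```fsharp
// Hypothetical CPU-bound function, used only for illustration.
let square (x: int) = x * x

let data = [| 1 .. 10 |]

let results =
    data
    |> Array.map (fun x -> async { return square x }) // one async computation per element
    |> Async.Parallel                                 // combine into a single Async<int[]>
    |> Async.RunSynchronously                         // block until every element is done

printfn "%A" results
```

Async.Parallel returns the results in the same order as the inputs, so results lines up with data element by element.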

So why is it that async is not really intended for this? Donald Syme himself says that PLINQ and Futures are probably a better choice. Other answers I've read here agree and also recommend the TPL. (PLINQ doesn't seem much different from the built-in functions above, as long as you're using the F# PowerPack to get the PSeq functions.)

F# and functional languages make a lot of sense for this, and some applications have achieved great success with async parallelism.

So why shouldn't I use async to execute parallel data processes? What am I going to lose by writing parallel async code instead of using PLINQ or TPL?

3 Answers
ゆ 、 Hurt°
#2 · 2020-05-17 09:24

So why shouldn't I use async to execute parallel data processes?

If you have a small number of completely independent non-async tasks and plenty of cores, there is nothing wrong with using async to achieve parallelism. However, if your tasks depend on one another in any way, if you have more tasks than cores, or if you push async too far into the code, you will leave a lot of performance on the table. In those cases you could do much better with a more appropriate foundation for parallel programming.

Note that your example can be written even more elegantly using the TPL from F# though:

Array.Parallel.map f xs
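As a rough sketch of that one-liner in context (f and xs here are placeholder names with made-up contents, not anything from the question), Array.Parallel.map takes a plain function rather than one that returns an async, and gives the same answer as the sequential Array.map:

```fsharp
// Hypothetical work function and input, for illustration only.
let f (x: int) = x * x
let xs = [| 1 .. 1000 |]

let sequential = Array.map f xs          // ordinary single-threaded map
let parallel' = Array.Parallel.map f xs  // same signature, work spread over cores

printfn "same results: %b" (sequential = parallel')
```

Because the signature matches Array.map, switching between the two is a one-word change, which makes it easy to measure whether parallelism pays off for a given f.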

What am I going to lose by writing parallel async code instead of using PLINQ or TPL?

You lose the ability to write cache-oblivious code. As a consequence you will suffer many cache misses, so cores stall waiting on shared memory, and the program scales poorly on a multicore machine.

The TPL is built upon the idea that child tasks should execute on the same core as their parent with a high probability and, therefore, will benefit from reusing the same data because it will be hot in the local CPU cache. There is no such assurance with async.
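To make the parent/child pattern concrete, here is a hedged sketch of a recursive divide-and-conquer sum using tasks. The function name parallelSum and the 4096-element cutoff are my own assumptions for illustration. As far as I understand the default scheduler, a task started from inside a running pool task is queued locally to that worker thread, which is what gives the child a good chance of running where its data is already cached; the code itself does not prove that, it only shows the shape of the program.

```fsharp
open System
open System.Threading.Tasks

// Sums xs.[lo .. hi-1] by recursive splitting.
// The cutoff of 4096 is an arbitrary, assumed value; tune it by measurement.
let rec parallelSum (xs: int64[]) (lo: int) (hi: int) : int64 =
    if hi - lo <= 4096 then
        // Small range: sum sequentially to keep task overhead down.
        let mutable acc = 0L
        for i in lo .. hi - 1 do
            acc <- acc + xs.[i]
        acc
    else
        let mid = lo + (hi - lo) / 2
        // Child task for the left half, spawned by the current (parent) computation.
        let left = Task.Run(Func<int64>(fun () -> parallelSum xs lo mid))
        // The parent handles the right half itself, then joins with the child.
        let right = parallelSum xs mid hi
        left.Result + right

let numbers = Array.init 1000000 int64
let total = parallelSum numbers 0 numbers.Length
printfn "total = %d" total
```

The same recursion written with Async.StartChild would run correctly too; the point of the paragraph above is only that async makes no promise about where the child ends up executing.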

来,给爷笑一个
#3 · 2020-05-17 09:25

I always figured it comes down to what TPL, PLINQ, etc. give you over and above what Async does. (Cancellation mechanisms are the one that comes to mind.) This question has some better answers.

This article hints at a slight performance advantage to TPL, but probably not enough to be significant.

孤傲高冷的网名
#4 · 2020-05-17 09:34

I wrote an article that re-implements one C# TPL sample using both Task and Async, which also has some comments on the difference between the two. You can find it here and there is also a more advanced async-based version.

Here is a quote from the first article that compares the two options:

The choice between the two possible implementations depends on many factors. Asynchronous workflows were designed specifically for F#, so they more naturally fit with the language. They offer better performance for I/O bound tasks and provide more convenient exception handling. Moreover, the sequential syntax is quite convenient. On the other hand, tasks are optimized for CPU bound calculations and make it easier to access the result of calculation from other places of the application without explicit caching.
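The last point in that quote, about accessing a result from other places without explicit caching, can be illustrated with a small sketch. The counters and the compute function are hypothetical names added for the demonstration. An async value is a description of work that runs again each time it is started, whereas a task is already running and hands the same stored result to every reader.

```fsharp
open System
open System.Threading.Tasks

// Counters that record how many times each body actually executes.
let asyncRuns = ref 0
let taskRuns = ref 0

// An async workflow: nothing runs until it is started.
let work =
    async {
        asyncRuns.Value <- asyncRuns.Value + 1
        return 42
    }

let a1 = Async.RunSynchronously work
let a2 = Async.RunSynchronously work // starts the workflow a second time

// Hypothetical function run as a task: it starts once when the task is created.
let compute () =
    taskRuns.Value <- taskRuns.Value + 1
    42

let task = Task.Run(Func<int>(compute))
let t1 = task.Result
let t2 = task.Result // reads the stored result, no re-execution

printfn "async body ran %d times, task body ran %d time(s)" asyncRuns.Value taskRuns.Value
```

If you want the run-once behaviour while staying with workflows, one option is to convert the workflow with Async.StartAsTask and share the resulting task.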
