TPL Dataflow vs BlockingCollection

Posted 2020-06-17 14:18

I understand that a BlockingCollection is best suited for a producer/consumer pattern. However, when should I use an ActionBlock from the TPL Dataflow library?

My initial understanding is that for IO operations I should keep the BlockingCollection, while CPU-intensive operations are best suited to an ActionBlock. But I feel like this isn't the whole story... Any additional insight?

2 answers
做自己的国王
#2 · 2020-06-17 14:49

TPL Dataflow is better suited for an actor-based design. That means that if you want to chain producers and consumers, it's much easier with TDF.
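As a rough illustration of what chaining looks like, here is a minimal sketch of a two-stage pipeline. The class name `PipelineSketch`, the squaring step, and the result queue are made up for the example; only `TransformBlock`, `ActionBlock`, `LinkTo`, `Post`, `Complete`, and `Completion` come from the library.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class PipelineSketch
{
    // Hypothetical two-stage pipeline: square each number, then collect it.
    public static async Task<int[]> RunAsync(int[] inputs)
    {
        var results = new ConcurrentQueue<int>();

        // Stage 1 is both a consumer (of ints) and a producer (of squares).
        var square = new TransformBlock<int, int>(n => n * n);

        // Stage 2 only consumes.
        var collect = new ActionBlock<int>(n => results.Enqueue(n));

        // Chain the stages; completion of stage 1 flows on to stage 2.
        square.LinkTo(collect, new DataflowLinkOptions { PropagateCompletion = true });

        foreach (var n in inputs)
            square.Post(n);

        square.Complete();          // no more input
        await collect.Completion;   // wait until everything has drained

        return results.ToArray();
    }

    public static async Task Main()
    {
        var output = await RunAsync(new[] { 1, 2, 3 });
        Console.WriteLine(string.Join(", ", output)); // prints "1, 4, 9"
    }
}
```

Adding another stage is just another block and another `LinkTo` call, which is where the approach pays off compared with wiring several collections and threads together by hand.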

Another big plus for TPL Dataflow is that it was built with async in mind. You can both produce and consume synchronously or asynchronously (and mix the two), which is very useful. (I mostly produce synchronously and consume in a non-blocking async way.)

You can also very easily set a bounded capacity and a degree of parallelism.

TL;DR: BlockingCollection is a simple and general tool. TPL Dataflow is much more robust, but can be overkill or a bad fit for specific problems.
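To make the async and throttling points concrete, here is a small sketch under some assumptions: the class name `BoundedSketch`, the capacity of 4, the parallelism of 2, and the `Task.Delay` standing in for real I/O are all invented for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class BoundedSketch
{
    public static async Task<int> RunAsync(int itemCount)
    {
        int processed = 0;

        var options = new ExecutionDataflowBlockOptions
        {
            BoundedCapacity = 4,         // at most 4 items queued or in flight
            MaxDegreeOfParallelism = 2   // up to 2 items handled concurrently
        };

        // Async consumer: Task.Delay is a placeholder for non-blocking I/O.
        var worker = new ActionBlock<int>(async item =>
        {
            await Task.Delay(10);
            Interlocked.Increment(ref processed);
        }, options);

        for (int i = 0; i < itemCount; i++)
        {
            // SendAsync waits without blocking a thread while the block is full.
            // Post would instead return false immediately in that case.
            await worker.SendAsync(i);
        }

        worker.Complete();
        await worker.Completion;
        return processed;
    }

    public static async Task Main()
    {
        Console.WriteLine(await RunAsync(10)); // prints 10
    }
}
```

With an unbounded block, a synchronous `Post` is enough on the producing side; once a bounded capacity is set, `SendAsync` is the safer choice because a rejected `Post` silently drops the item unless its return value is checked.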

forever°为你锁心
#3 · 2020-06-17 15:01

Not sure if the repeated use of the word Block is causing confusion here. They are very different things.

You're right, a BlockingCollection is well suited to a producer-consumer situation, in that it will block an attempt to read from it until data is available. However, BlockingCollection is not a part of TPL Dataflow. It was introduced in .NET 4.0 as one of the new thread-safe collection types.
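For comparison, a bare-bones producer-consumer with BlockingCollection might look like the sketch below. The class name `BlockingCollectionSketch`, the capacity of 4, and the summing consumer are assumptions made for the example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class BlockingCollectionSketch
{
    public static int Run(int itemCount)
    {
        using var queue = new BlockingCollection<int>(4); // bounded to 4 items
        int sum = 0;

        var consumer = Task.Run(() =>
        {
            // Blocks while the collection is empty; the loop ends once
            // CompleteAdding has been called and the collection is drained.
            foreach (var item in queue.GetConsumingEnumerable())
                sum += item;
        });

        for (int i = 1; i <= itemCount; i++)
            queue.Add(i); // blocks while the collection is full

        queue.CompleteAdding();
        consumer.Wait();
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(Run(10)); // prints 55
    }
}
```

Note that both sides block a thread while they wait, which is the main practical difference from the awaitable Dataflow equivalents.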

An ActionBlock, however, is a type of 'Block' defined by TPL Dataflow, and can be used to perform an action. 'Block', in this sense, refers to its use as a part of a data flow.

Data flows, as defined in TPL Dataflow, are made up of blocks, and there are three main types. From the documentation:

The TPL Dataflow Library consists of dataflow blocks, which are data structures that buffer and process data. The TPL defines three kinds of dataflow blocks: source blocks, target blocks, and propagator blocks. A source block acts as a source of data and can be read from. A target block acts as a receiver of data and can be written to. A propagator block acts as both a source block and a target block, and can be read from and written to. The TPL defines the System.Threading.Tasks.Dataflow.ISourceBlock<TOutput> interface to represent sources, System.Threading.Tasks.Dataflow.ITargetBlock<TInput> to represent targets, and System.Threading.Tasks.Dataflow.IPropagatorBlock<TInput, TOutput> to represent propagators. IPropagatorBlock<TInput, TOutput> inherits from both ISourceBlock<TOutput> and ITargetBlock<TInput>. The TPL Dataflow Library provides several predefined dataflow block types that implement the ISourceBlock<TOutput>, ITargetBlock<TInput>, and IPropagatorBlock<TInput, TOutput> interfaces. These dataflow block types are described in this document in the section Predefined Dataflow Block Types.

An ActionBlock is a type of ITargetBlock: it takes an input, performs an action on it, and produces no output, so the flow stops there.
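The relationship between the three interfaces can be seen by assigning concrete blocks to them. This is only a sketch: the class name `BlockKindsSketch`, the variable names, and the formatting step are hypothetical.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class BlockKindsSketch
{
    public static async Task<string> RunAsync()
    {
        string received = "";

        // TransformBlock is a propagator: a target for int, a source of string.
        IPropagatorBlock<int, string> propagator =
            new TransformBlock<int, string>(n => $"item {n}");
        ITargetBlock<int> asTarget = propagator;     // can be written to
        ISourceBlock<string> asSource = propagator;  // can be read from

        // ActionBlock is only a target: it consumes input and emits nothing.
        ITargetBlock<string> sink =
            new ActionBlock<string>(s => { received = s; });

        asSource.LinkTo(sink, new DataflowLinkOptions { PropagateCompletion = true });

        asTarget.Post(42);
        asTarget.Complete();
        await sink.Completion;

        return received;
    }

    public static async Task Main()
    {
        Console.WriteLine(await RunAsync()); // prints "item 42"
    }
}
```

Because the ActionBlock exposes no source side, nothing can be linked after it, which is why it naturally ends up as the last stage of a flow.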

To answer your first question: I would use a BlockingCollection when your process is simple, and TPL Dataflow when your process is more complicated. In the latter case you probably won't need a BlockingCollection at all.

There are examples of the Producer-Consumer problem using BlockingCollection here: http://blogs.msdn.com/b/csharpfaq/archive/2010/08/12/blocking-collection-and-the-producer-consumer-problem.aspx?Redirected=true and here: http://programmerfindings.blogspot.co.uk/2012/07/producer-consumer-problem-using-tpl-and.html

Neither of these use Dataflow. There is an example of one using Dataflow here:

http://msdn.microsoft.com/en-us/library/hh228601(v=vs.110).aspx

Plus, I would strongly suggest reading the TPL Dataflow documentation here:

http://msdn.microsoft.com/en-us/library/hh228601(v=vs.110).aspx

if you are implementing anything complex.
