So, I'm trying to wrap my head around Microsoft's Dataflow library. I've built a very simple pipeline consisting of just two blocks:
var start = new TransformBlock<Foo, Bar>();
var end = new ActionBlock<Bar>();
start.LinkTo(end);
Now I can asynchronously process Foo instances by calling:
start.SendAsync(new Foo());
What I do not understand is how to do the processing synchronously when needed. I thought that waiting on SendAsync would be enough:
start.SendAsync(new Foo()).Wait();
But apparently it returns as soon as the item is accepted by the first block in the pipeline, not when the item has been fully processed. So is there a way to wait until a given item has been processed by the last (end) block, apart from passing a WaitHandle through the entire pipeline?
In short, that's not supported out of the box in Dataflow. Essentially, what you need to do is tag the data so you can retrieve it when processing is done. I've written up a way to do this that lets the consumer await a Job as it gets processed by the pipeline. The only concession to pipeline design is that each block takes a KeyValuePair<Guid, T>. This is the basic JobManager and the post I wrote about it. Note that the code in the post is a bit dated and needs some updates, but it should get you headed in the right direction.
I ended up using the following pipeline:
Where
Less than ideal, but it works.
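For anyone landing here later, this is a minimal sketch of the tagging idea from the first answer. It is not the JobManager from the linked post, and not the pipeline from the answer just above. It assumes every block passes a KeyValuePair<Guid, T> along, and uses a dictionary of TaskCompletionSource instances to signal the caller once the last block has seen the item. The names pending and ProcessAsync are made up for illustration, and int/string stand in for Foo/Bar.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Pending jobs, keyed by the Guid that travels with each item (hypothetical name).
var pending = new ConcurrentDictionary<Guid, TaskCompletionSource<string>>();

// Every block takes and returns KeyValuePair<Guid, T> so the tag survives the pipeline.
// int and string are stand-ins for Foo and Bar.
var start = new TransformBlock<KeyValuePair<Guid, int>, KeyValuePair<Guid, string>>(
    job => new KeyValuePair<Guid, string>(job.Key, $"processed {job.Value}"));

var end = new ActionBlock<KeyValuePair<Guid, string>>(job =>
{
    // Last block: find the waiting caller for this tag and hand over the result.
    if (pending.TryRemove(job.Key, out var tcs))
        tcs.TrySetResult(job.Value);
});

start.LinkTo(end, new DataflowLinkOptions { PropagateCompletion = true });

// Hypothetical helper: sends one item and returns a task that completes
// only after the end block has handled that same item.
async Task<string> ProcessAsync(int input)
{
    var id = Guid.NewGuid();
    var tcs = new TaskCompletionSource<string>(
        TaskCreationOptions.RunContinuationsAsynchronously);
    pending[id] = tcs;

    // SendAsync returns false if the block declines the item (e.g. it has completed).
    if (!await start.SendAsync(new KeyValuePair<Guid, int>(id, input)))
    {
        pending.TryRemove(id, out _);
        throw new InvalidOperationException("The pipeline declined the item.");
    }

    return await tcs.Task;
}

string result = await ProcessAsync(42);
Console.WriteLine(result); // prints "processed 42"
```

A caller that really has to block can call ProcessAsync(42).GetAwaiter().GetResult() instead of awaiting, with the usual caveat about deadlocks under a synchronization context. Note that this sketch does not handle a block throwing: in that case the matching TaskCompletionSource would never complete, so a real version would need to fault the pending entries as well.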