Dependency injection with parallel processing

2019-06-22 09:32发布

问题:

I am trying to practice manual dependency injection (no framework yet) to remove tight coupling in my code. This is just for practice - I don't have a specific implementation in mind.

So far simple constructor injection has worked fine.

However I cannot work out how to solve creating a tight coupling when one class must use another within a parallel loop. Example:

public class Processor
{
    private IWorker worker;
    public Processor(IWorker worker)
    {
        this.worker = worker;
    }
    public List<string> DoStuff()
    {
        var list = new List<string>();
        for (int i = 0; i < 10; i++)
        {
            list.Add(worker.GetString());
        }
        return list;
    }

    public List<string> DoStuffInParallel()
    {
        var list = new System.Collections.Concurrent.ConcurrentBag<string>();

        Parallel.For(0, 10, i => 
        { 
            //is there any trivial way to avoid this??
            list.Add(new Worker().GetString());
        });
        return list.ToList();
    }
}

public class Worker : IWorker
{
    public string GetString()
    {
        //pretend this relies on some instance variable, so it not threadsafe
        return "a string";
    }
}

Avoiding the obvious fact that a parallel loop will be slower than a standard loop in the above case, how could i write the Processor.DoStuffInParallel() method to avoid the current hard dependency on the Worker class?

回答1:

One way to decouple this is by injecting a factory, e.g.:

public List<string> DoStuffInParallel(IWorkerFactory factory)
{
    var list = new System.Collections.Concurrent.ConcurrentBag<string>();

    Parallel.For(0, 10, i => 
    { 
        list.Add(factory.Create().GetString());
    });
    return list.ToList();
}

The factory could be an container-owned singleton, and the Create() would need to be thread safe.

Note of course that your tasks can't concurrently mutate the list - you'll need to synchronize access when adding the worker result to the list (apols, missed your ConcurrentBag)- In order to reduce contention on the bag, you might also want to look at one of the Parallel.For overloads with localinit / localFinally to do a local aggregation of results into a per-task list, before synchronizing to the aggregated / overall bag in the localFinally.

Edit
Re: Do I need to inject a factory for ConcurrentBag<String> ? IMO, this is fine to create the ConcurrentBag directly - it is an implementation specific detail, rather than a dependency. e.g. a Single threaded implementation may have implemented this as :

return Enumerable.Range(0, 10)
                 .Select(i => factory.Create().GetString())
                 .ToList();

i.e. without any explicit construction of the intermediate container.

You may choose to soften the interface to the method to public IList<string> DoStuffInParallel or even to IEnumerable<string> (the minimum possible contract / commitment). The dependency here is on the Worker, which is what you will want to be able to mock in unit testing.