Parallel iteration in C#?

Posted 2019-03-16 03:54

Question:

Is there a way to do foreach style iteration over parallel enumerables in C#? For subscriptable lists, I know one could use a regular for loop iterating an int over the index range, but I really prefer foreach to for for a number of reasons.

Bonus points if it works in C# 2.0

Answer 1:

Short answer, no. foreach works on only one enumerable at a time.

However, if you combine your parallel enumerables into a single one, you can foreach over the combined enumerable. I am not aware of any easy, built-in method of doing this, but the following should work (though I have not tested it):

public IEnumerable<TSource[]> Combine<TSource>(params object[] sources)
{
    foreach(var o in sources)
    {
        // Choose your own exception
        if(!(o is IEnumerable<TSource>)) throw new Exception();
    }

    var enums =
        sources.Select(s => ((IEnumerable<TSource>)s).GetEnumerator())
        .ToArray();

    while(enums.All(e => e.MoveNext()))
    {
        yield return enums.Select(e => e.Current).ToArray();
    }
}

Then you can foreach over the returned enumerable:

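// TSource cannot be inferred from object[] arguments, so it must be
// given explicitly (int here stands in for the actual element type).
foreach(var v in Combine<int>(en1, en2, en3))
{
    // Remembering that v is an array of the type contained in en1,
    // en2 and en3.
}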
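
Since the question asks about C# 2.0 and the Combine method above relies on LINQ (C# 3.0), here is a minimal sketch of the same lockstep idea for exactly two sequences, using only iterators and generics. Pair and Lockstep are hypothetical names chosen for illustration, not framework types. (From .NET 4 onward, Enumerable.Zip covers the two-sequence case.)

```csharp
using System.Collections.Generic;

// Hypothetical helper type: C# 2.0 has no tuples or anonymous types.
public struct Pair<TFirst, TSecond>
{
    public readonly TFirst First;
    public readonly TSecond Second;

    public Pair(TFirst first, TSecond second)
    {
        First = first;
        Second = second;
    }
}

public static class Lockstep
{
    // Yields one Pair per position and stops when either sequence ends.
    public static IEnumerable<Pair<TFirst, TSecond>> Zip<TFirst, TSecond>(
        IEnumerable<TFirst> first, IEnumerable<TSecond> second)
    {
        // The using statements dispose both enumerators, including when
        // the caller leaves the foreach early.
        using (IEnumerator<TFirst> e1 = first.GetEnumerator())
        using (IEnumerator<TSecond> e2 = second.GetEnumerator())
        {
            while (e1.MoveNext() && e2.MoveNext())
            {
                yield return new Pair<TFirst, TSecond>(e1.Current, e2.Current);
            }
        }
    }
}
```

A caller would write foreach (Pair<int, string> p in Lockstep.Zip(numbers, names)) and read p.First and p.Second inside the loop body.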


Answer 2:

.NET 4's BlockingCollection makes this pretty easy. Create a BlockingCollection and return its GetConsumingEnumerable() from the enumerable method. A background task then runs the foreach loops, which simply add each item to the blocking collection, while the caller consumes items as they arrive.

E.g.

private BlockingCollection<T> m_data = new BlockingCollection<T>();

public IEnumerable<T> GetData( IEnumerable<IEnumerable<T>> sources )
{
    Task.Factory.StartNew( () => ParallelGetData( sources ) );
    return m_data.GetConsumingEnumerable();
}

private void ParallelGetData( IEnumerable<IEnumerable<T>> sources )
{
    try
    {
        foreach( var source in sources )
        {
            foreach( var item in source )
            {
                m_data.Add( item );
            }
        }
    }
    finally
    {
        //Adding complete (or a source threw); either way the enumeration can stop now
        m_data.CompleteAdding();
    }
}

Hope this helps. BTW I posted a blog about this last night

Andre
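
The code above drains the sources one after another on a single background task. As a hedged variation, the sketch below starts one task per source so the sources are read concurrently, and creates a fresh collection on every call instead of sharing a field. Merged and Merge are hypothetical names, and the sketch assumes that enumerating a source does not throw.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class Merged
{
    // Streams the items of every source through one thread-safe queue.
    // Items from different sources arrive in no particular order.
    public static IEnumerable<T> Merge<T>(IEnumerable<IEnumerable<T>> sources)
    {
        // One collection per call, so repeated calls do not share state.
        var queue = new BlockingCollection<T>();
        var producers = new List<Task>();

        foreach (var source in sources)
        {
            var captured = source; // copy: loop variables were shared before C# 5
            producers.Add(Task.Factory.StartNew(() =>
            {
                foreach (var item in captured)
                {
                    queue.Add(item);
                }
            }));
        }

        if (producers.Count == 0)
        {
            // ContinueWhenAll rejects an empty task array.
            queue.CompleteAdding();
        }
        else
        {
            // Mark the queue complete once every producer has finished,
            // which lets the consumer's foreach terminate.
            Task.Factory.ContinueWhenAll(producers.ToArray(), _ => queue.CompleteAdding());
        }

        return queue.GetConsumingEnumerable();
    }
}
```

The caller still writes a plain foreach over the returned enumerable; the loop blocks while the queue is empty and ends after CompleteAdding has been called and the queue is drained.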



Answer 3:

Zooba's answer is good, but you might also want to look at the answers to "How to iterate over two arrays at once".



Answer 4:

I wrote EachParallel(), an implementation modeled on the .NET 4 Parallel library. It is compatible with .NET 3.5 (see "Parallel ForEach Loop in C# 3.5"). Usage:

string[] names = { "cartman", "stan", "kenny", "kyle" };
names.EachParallel(name =>
{
    try
    {
        Console.WriteLine(name);
    }
    catch { /* handle exception */ }
});

Implementation:

/// <summary>
/// Enumerates through each item in a list in parallel
/// </summary>
public static void EachParallel<T>(this IEnumerable<T> list, Action<T> action)
{
    // enumerate the list so it can't change during execution
    list = list.ToArray();
    var count = list.Count();

    if (count == 0)
    {
        return;
    }
    else if (count == 1)
    {
        // if there's only one element, just execute it
        action(list.First());
    }
    else
    {
        // Launch each method as its own thread-pool work item
        const int MaxHandles = 64;
        for (var offset = 0; offset < count; offset += MaxHandles)
        {
            // Break the list into chunks of at most 64 items because
            // WaitHandle.WaitAll accepts no more than 64 handles
            var chunk = list.Skip(offset).Take(MaxHandles);

            // Initialize the reset events to keep track of completed threads
            var resetEvents = new ManualResetEvent[chunk.Count()];

            // spawn a thread for each item in the chunk
            int i = 0;
            foreach (var item in chunk)
            {
                resetEvents[i] = new ManualResetEvent(false);
                ThreadPool.QueueUserWorkItem(new WaitCallback((object data) =>
                {
                    int methodIndex = (int)((object[])data)[0];

                    // Execute the method and pass in the enumerated item
                    action((T)((object[])data)[1]);

                    // Tell the calling thread that we're done
                    resetEvents[methodIndex].Set();
                }), new object[] { i, item });
                i++;
            }

            // Wait for all threads to execute
            WaitHandle.WaitAll(resetEvents);
        }
    }
}
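
The 64-item chunking above exists only because WaitHandle.WaitAll accepts at most 64 handles. One possible alternative, sketched here under the assumption that the action handles its own exceptions (as in the usage example), is to count outstanding work items with Interlocked and wait on a single event. EachParallelCounted is a hypothetical name, not part of the answer's library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public static class CountdownExtensions
{
    // Runs action for every item on the thread pool and blocks
    // until all work items have completed.
    public static void EachParallelCounted<T>(this IEnumerable<T> list, Action<T> action)
    {
        // Snapshot the sequence so it cannot change during execution.
        T[] items = list.ToArray();
        if (items.Length == 0)
        {
            return;
        }

        int pending = items.Length;
        using (var done = new ManualResetEvent(false))
        {
            foreach (T item in items)
            {
                T captured = item; // copy: loop variables were shared before C# 5
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    // Assumption: action does not throw.
                    action(captured);

                    // The last work item to finish releases the caller.
                    if (Interlocked.Decrement(ref pending) == 0)
                    {
                        done.Set();
                    }
                });
            }

            done.WaitOne();
        }
    }
}
```

Because only one wait handle is involved, no chunking is needed and all items can be queued at once.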


Answer 5:

If you want to stick to the basics - I rewrote the currently accepted answer in a simpler way:

    public static IEnumerable<TSource[]> Combine<TSource> (this IEnumerable<IEnumerable<TSource>> sources)
    {
        var enums = sources
            .Select (s => s.GetEnumerator ())
            .ToArray ();

        while (enums.All (e => e.MoveNext ())) {
            yield return enums.Select (e => e.Current).ToArray ();
        }
    }

    public static IEnumerable<TSource[]> Combine<TSource> (params IEnumerable<TSource>[] sources)
    {
        return sources.Combine ();
    }
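
Two details of this approach are easy to miss: the enumerators are never disposed, and with zero sources All() is vacuously true, so the while loop never ends. The following is a minimal sketch of a variant that addresses both. CombineDisposing is a hypothetical name used to avoid clashing with the methods above.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CombineExtensions
{
    // Yields one array per position, stopping at the shortest source.
    public static IEnumerable<T[]> CombineDisposing<T>(
        this IEnumerable<IEnumerable<T>> sources)
    {
        var enums = sources.Select(s => s.GetEnumerator()).ToArray();
        try
        {
            // With no sources, All() would always return true.
            if (enums.Length == 0)
            {
                yield break;
            }

            // All() stops at the first enumerator that is exhausted.
            while (enums.All(e => e.MoveNext()))
            {
                yield return enums.Select(e => e.Current).ToArray();
            }
        }
        finally
        {
            // Runs when iteration completes or the caller exits the foreach early.
            foreach (var e in enums)
            {
                e.Dispose();
            }
        }
    }
}
```

For sources { 1, 2, 3 } and { 10, 20 } this yields the arrays { 1, 10 } and { 2, 20 }; the unmatched 3 is dropped.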


Answer 6:

Would this work for you?

public static class Parallel
{
    public static void ForEach<T>(IEnumerable<T>[] sources,
                                  Action<T> action)
    {
        foreach (var enumerable in sources)
        {
            ThreadPool.QueueUserWorkItem(source => {
                foreach (var item in (IEnumerable<T>)source)
                    action(item);
            }, enumerable);
        }
    }
}

// sample usage:
static void Main()
{
    string[] s1 = { "1", "2", "3" };
    string[] s2 = { "4", "5", "6" };
    IEnumerable<string>[] sources = { s1, s2 };
    Parallel.ForEach(sources, s => Console.WriteLine(s));
    Thread.Sleep(0); // only yields the time slice; does not wait for the work items to finish
}

For C# 2.0, you need to convert the lambda expressions above to anonymous methods (the delegate keyword) and replace var with explicit types.

Note: This utility method uses background threads. You may want to modify it to use foreground threads, and probably you'll want to wait till all threads finish. If you do that, I suggest you create sources.Length - 1 threads, and use the current executing thread for the last (or first) source.

(I wish I could include waiting for threads to finish in my code, but I'm sorry that I don't know how to do that yet. I guess you should use a WaitHandle or Thread.Join().)
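
One way to add the missing wait, sketched here with Thread.Join, is to give each source a dedicated foreground thread and join them all before returning. ParallelSources and ForEachAndWait are hypothetical names, and the sketch assumes the action is safe to call from several threads at once.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class ParallelSources
{
    // Runs action over each source on its own foreground thread and
    // returns only after every thread has finished.
    public static void ForEachAndWait<T>(IEnumerable<T>[] sources, Action<T> action)
    {
        var threads = new List<Thread>();

        foreach (IEnumerable<T> source in sources)
        {
            IEnumerable<T> captured = source; // copy: loop variables were shared before C# 5
            var worker = new Thread(() =>
            {
                foreach (T item in captured)
                {
                    action(item);
                }
            });
            worker.Start();
            threads.Add(worker);
        }

        // Join blocks until the given thread has terminated.
        foreach (Thread t in threads)
        {
            t.Join();
        }
    }
}
```

With this version the Thread.Sleep(0) call in Main is no longer needed. Items within one source are processed in order, but the interleaving between sources is not deterministic.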