TPL Data Parallelism Issue

Posted 2019-08-03 16:33

I need to process a set of data in parallel and, at the end, know how many items in total were processed successfully. I came up with the following dummy code, based on the samples at http://msdn.microsoft.com/en-us/library/dd460703.aspx and http://reedcopsey.com/2010/01/22/parallelism-in-net-part-4-imperative-data-parallelism-aggregation/

    public void DoWork2()
    {
        int sum = 0;
        Parallel.For<int>(0, 10,
            () => 0,
            (i, loopState, localState) =>
            {
                DummyEntity entity = DoWork3(i);
                if (entity != null)
                {
                    Console.WriteLine("Processed {0}, sum need to be increased by 1.", i);
                    return 1;
                }
                else
                {
                    Console.WriteLine("Processed {0}, sum need to be increased by 0.", i);
                    return 0;
                }
            },
            localState =>
            {
                lock (syncRoot)
                {
                    Console.WriteLine("Increase sum {0} by {1}", sum, localState);
                    sum += localState;
                }
            }
            );
        Console.WriteLine("Total items {0}", sum);
    }

    private DummyEntity DoWork3(int i)
    {
        if (i % 2 == 0)
        {
            return new DummyEntity();
        }
        else
        {
            return null;
        }
    }

However, the result changes every time I run it. I think something is wrong with the code, but I can't figure out what.

1 answer
我命由我不由天
#2 · 2019-08-03 17:15

Your problem is your choice of overload. You've picked the overload that carries local state in order to minimize the use of global state, yet you're not using the local state.

Notice that the example you linked uses the subtotal (what you've called localState) in the body of the loop:

    subtotal += nums[j];
    return subtotal;

Compare this to your code (made a bit more concise):

    if (entity != null)
    {
        return 1;
    }
    else
    {
        return 0;
    }

There is no mention of localState, so each iteration discards the count accumulated so far and you've effectively thrown away some of the answers. If you change it to read:

    if (entity != null)
    {
        return localState + 1;
    }
    else
    {
        return localState;
    }

You'll get the following output on the command line (for this given problem):

Total items 5

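Local state is used this way to reduce access to shared state: each task accumulates its own subtotal, and the lock around sum is taken only when that subtotal is merged in the final delegate, not on every iteration.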
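For reference, here is a minimal, self-contained sketch of the corrected loop. The class name LocalStateDemo, the method CountSuccesses, and the empty DummyEntity class are made up for illustration, and DoWork3 is a stand-in that mirrors the question (it succeeds for even indices). syncRoot is assumed to be a plain lock object, since the question does not show its declaration.

```csharp
using System;
using System.Threading.Tasks;

public static class LocalStateDemo
{
    // Hypothetical placeholder for the question's entity type.
    private sealed class DummyEntity { }

    // Stand-in for the question's DoWork3: returns an entity for even i, null otherwise.
    private static DummyEntity DoWork3(int i)
    {
        return i % 2 == 0 ? new DummyEntity() : null;
    }

    public static int CountSuccesses(int fromInclusive, int toExclusive)
    {
        int sum = 0;
        object syncRoot = new object(); // assumed: the lock object from the question

        Parallel.For<int>(fromInclusive, toExclusive,
            // localInit: every task starts with a subtotal of 0.
            () => 0,
            // body: the returned value is passed in as localState on this
            // task's next iteration, so it has to carry the running subtotal.
            (i, loopState, localState) =>
                DoWork3(i) != null ? localState + 1 : localState,
            // localFinally: merge each task's subtotal into the shared sum.
            localState =>
            {
                lock (syncRoot)
                {
                    sum += localState;
                }
            });

        return sum;
    }

    public static void Main()
    {
        Console.WriteLine("Total items {0}", CountSuccesses(0, 10)); // Total items 5
        Console.WriteLine("Total items {0}", CountSuccesses(0, 50)); // Total items 25
    }
}
```

The totals are deterministic even though the order in which tasks merge their subtotals is not.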

Here is a snippet from using 0..50 as the range:

Processed 22, sum need to be increased by 1.
Processed 23, sum need to be increased by 0.
Increase sum 0 by 1
Processed 8, sum need to be increased by 1.
Processed 9, sum need to be increased by 0.
Processed 10, sum need to be increased by 1.
Processed 11, sum need to be increased by 0.
Increase sum 1 by 2
Increase sum 3 by 8
Increase sum 11 by 10
Processed 16, sum need to be increased by 1.
Processed 17, sum need to be increased by 0.
Processed 18, sum need to be increased by 1.
Increase sum 21 by 4
Total items 25
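In the log above, each "Increase sum" line is one task handing over its subtotal, and the five subtotals (1, 2, 8, 10, 4) add up to 25. To see why the original body loses counts, it can help to model what a single task does with its share of the indices. The sketch below is a simplified sequential model, not the actual TPL implementation: RunPartition and PartitionModel are hypothetical names, and the real runtime decides how the range is split among tasks.

```csharp
using System;

public static class PartitionModel
{
    // Simplified model of one task: start from the initial local state,
    // feed each returned value into the next iteration, and hand the
    // final value to the merge step.
    public static int RunPartition(int fromInclusive, int toExclusive,
                                   Func<int, int, int> body)
    {
        int localState = 0; // what localInit returns
        for (int i = fromInclusive; i < toExclusive; i++)
        {
            localState = body(i, localState); // return value becomes the next localState
        }
        return localState; // what localFinally receives
    }

    public static void Main()
    {
        // Original body: ignores localState, so only the last iteration counts.
        Func<int, int, int> original = (i, local) => i % 2 == 0 ? 1 : 0;
        // Corrected body: carries the running subtotal forward.
        Func<int, int, int> corrected = (i, local) => i % 2 == 0 ? local + 1 : local;

        Console.WriteLine(RunPartition(0, 10, original));  // 0 (index 9 is odd)
        Console.WriteLine(RunPartition(0, 10, corrected)); // 5
    }
}
```

With the original body, each task contributes only the result of its last iteration, and since the partitioning differs from run to run, so does the total.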