VB.NET running sum in nested loop inside Parallel.

Posted 2019-08-30 00:09

Question:

Below is the best representation I have been able to develop of calculating a running sum in a loop nested inside a Parallel.For loop in VB.NET (Visual Studio 2010, .NET Framework 4). When the results in `sum` are shown on screen, the two sums differ slightly, so the parallelized variant appears to lose information. How is the information being lost, and what is happening? Can anyone offer some "microsurgery" on the methodology for keeping a running sum in this context? (Note to new users of Parallel.For: I typically don't use zero-based counters, so in the Parallel.For statement I1 starts at 1 and the upper bound is 101. That bound is exclusive, so the last index processed is 101 - 1 = 100; MS designed the parallel loops around zero-based counters.):

    Dim sum As Double = 0
    Dim lock As New Object
    Dim clock As New Stopwatch
    Dim i, j As Integer
    clock.Start()
    sum = 0
    For i = 1 To 100
        For j = 1 To 100
            sum += Math.Log(0.9999)
        Next j
    Next i
    clock.Stop()
    MsgBox(sum & "  " & clock.ElapsedMilliseconds)
    sum = 0
    clock.Reset()
    clock.Start()
    Parallel.For(1, 101, Sub(i1)
                             Dim temp As Double = 0
                             For j1 As Integer = 1 To 100
                                 temp += Math.Log(0.9999)
                             Next
                             SyncLock lock
                                 sum += temp
                             End SyncLock
                         End Sub)
    clock.Stop()
    MsgBox(sum & "  " & clock.ElapsedMilliseconds)    

Answer 1:

You are working with doubles, and doubles are simply not exact: every addition rounds its result. In the non-parallel loop, all rounding errors accumulate directly in sum. In the parallel loop you have an additional temp that collects the inner loop's terms and is only later added to sum, so the additions are grouped differently and round differently. Use the same kind of temporary in your non-parallel loop (adding it to sum after the inner loop has run) and the results will be equal:

    Dim sum As Double = 0
    Dim lock As New Object
    Dim clock As New Stopwatch
    Dim i, j As Integer

    ' 1) Sequential: every term is added directly to sum.
    clock.Start()
    For i = 1 To 100
        For j = 1 To 100
            sum += Math.Log(0.9999)
        Next j
    Next i
    clock.Stop()
    Console.WriteLine(sum & "  " & clock.ElapsedMilliseconds)

    ' 2) Sequential, but grouped like the parallel version:
    '    each inner loop is summed into tmp first, then tmp is added to sum.
    sum = 0
    clock.Reset()
    clock.Start()
    For i = 1 To 100
        Dim tmp As Double = 0
        For j = 1 To 100
            tmp += Math.Log(0.9999)
        Next
        sum += tmp
    Next i
    clock.Stop()
    Console.WriteLine(sum & "  " & clock.ElapsedMilliseconds)

    ' 3) Parallel: the original version from the question.
    sum = 0
    clock.Reset()
    clock.Start()
    Parallel.For(1, 101, Sub(i1)
                             Dim temp As Double = 0
                             For j1 As Integer = 1 To 100
                                 temp += Math.Log(0.9999)
                             Next
                             SyncLock lock
                                 sum += temp
                             End SyncLock
                         End Sub)
    clock.Stop()
    Console.WriteLine(sum & "  " & clock.ElapsedMilliseconds)

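Output (the comma is the decimal separator in my locale; the second column is the elapsed time in milliseconds):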

-1,00005000333357  0
-1,00005000333347  0
-1,00005000333347  26

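Conclusion: floating-point addition is not associative. If you work with doubles, (a + b) + c is NOT always equal to a + (b + c), so grouping the same terms differently can change the last digits of the result.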
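
The conclusion can be checked directly with three operands. The following is only a minimal sketch: the module name and the two helper functions are made up for illustration, and the values 0.1, 0.2 and 0.3 are an arbitrary choice that happens to expose the effect.

```vbnet
Imports System

Module AssociativityDemo
    ' Hypothetical helper: adds a and b first, then c.
    Function SumLeftFirst(ByVal a As Double, ByVal b As Double, ByVal c As Double) As Double
        Return (a + b) + c
    End Function

    ' Hypothetical helper: adds b and c first, then a.
    Function SumRightFirst(ByVal a As Double, ByVal b As Double, ByVal c As Double) As Double
        Return a + (b + c)
    End Function

    Sub Main()
        Dim leftFirst As Double = SumLeftFirst(0.1, 0.2, 0.3)
        Dim rightFirst As Double = SumRightFirst(0.1, 0.2, 0.3)

        ' "R" (round-trip) formatting prints enough digits to tell the two values apart.
        Console.WriteLine(leftFirst.ToString("R"))
        Console.WriteLine(rightFirst.ToString("R"))

        ' Expected to print False with IEEE 754 double arithmetic:
        ' the intermediate sums are rounded at different points.
        Console.WriteLine(leftFirst = rightFirst)
    End Sub
End Module
```

Neither result is "the wrong one"; both are correctly rounded for the order in which the additions were performed.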

UPDATE

An even simpler example:

    Dim sum As Double
    For i = 1 To 100
        sum += 0.1
    Next
    Console.WriteLine(sum)

    sum = 0
    For i = 1 To 2
        Dim tmp As Double = 0
        For j = 1 To 50
            tmp += 0.1
        Next
        sum += tmp
    Next
    Console.WriteLine(sum)

Now the output is:

9,99999999999998
10
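
As for keeping a running sum inside Parallel.For, one possible approach (a sketch under assumptions, not something the code above does) is to give every outer iteration its own slot in an array and to combine the slots sequentially after the parallel loop has finished. The additions are then always grouped and ordered the same way, regardless of how the threads are scheduled, and no SyncLock is needed because no two iterations write to the same element. The names `SumByRows` and `rowSums` are hypothetical, and the loop here is zero-based.

```vbnet
Imports System
Imports System.Threading.Tasks

Module DeterministicParallelSum
    ' Hypothetical helper: sums rows * cols copies of Math.Log(0.9999).
    ' Each row is summed in parallel; the row totals are combined in index order.
    Function SumByRows(ByVal rows As Integer, ByVal cols As Integer) As Double
        Dim term As Double = Math.Log(0.9999)
        Dim rowSums(rows - 1) As Double   ' VB bounds give the last index: slots 0 .. rows - 1

        ' The upper bound of Parallel.For is exclusive, so this visits i = 0 .. rows - 1.
        Parallel.For(0, rows, Sub(i)
                                  Dim temp As Double = 0
                                  For j As Integer = 1 To cols
                                      temp += term
                                  Next
                                  rowSums(i) = temp   ' each iteration writes only its own slot
                              End Sub)

        ' Sequential reduction in a fixed order, after all parallel work is done.
        Dim sum As Double = 0
        For i As Integer = 0 To rows - 1
            sum += rowSums(i)
        Next
        Return sum
    End Function

    Sub Main()
        Console.WriteLine(SumByRows(100, 100).ToString("R"))
    End Sub
End Module
```

This performs the same additions in the same grouping as the sequential loop with tmp, so it should reproduce that loop's result. It does not make the sum more accurate than the other variants; it only makes the rounding repeatable from run to run.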