Weird numpy.sum behavior when adding zeros

I understand how mathematically-equivalent arithmentic operations can result in different results due to numerical errors (e.g. summing floats in different orders).

However, it surprises me that adding zeros to sum can change the result. I thought that this always holds for floats, no matter what: x + 0. == x.

Here's an example. I expected all the lines to be exactly zero. Can anybody please explain why this happens?

M = 4  # number of random values
Z = 4  # number of additional zeros
for i in range(20):
    a = np.random.rand(M)
    b = np.zeros(M+Z)
    b[:M] = a
    print a.sum() - b.sum()

-4.4408920985e-16
0.0
0.0
0.0
4.4408920985e-16
0.0
-4.4408920985e-16
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
2.22044604925e-16
0.0
4.4408920985e-16
4.4408920985e-16
0.0

It seems not to happen for smaller values of M and Z.

I also made sure a.dtype==b.dtype.

Here is one more example, which also demonstrates python's builtin sum behaves as expected:

a = np.array([0.1,      1.0/3,      1.0/7,      1.0/13, 1.0/23])
b = np.array([0.1, 0.0, 1.0/3, 0.0, 1.0/7, 0.0, 1.0/13, 1.0/23])
print a.sum() - b.sum()
=> -1.11022302463e-16
print sum(a) - sum(b)
=> 0.0

I'm using numpy V1.9.2.

Short answer: You are seeing the difference between

a + b + c + d

and

(a + b) + (c + d)

which because of floating point inaccuracies is not the same.

Long answer: Numpy implements pair-wise summation as an optimization of both speed (it allows for easier vectorization) and rounding error.

The numpy sum-implementation can be found here (function pairwise_sum_@TYPE@). It essentially does the following:

If the length of the array is less than 8, a regular for-loop summation is performed. This is why the strange result is not observed if W < 4 in your case - the same for-loop summation will be used in both cases.
If the length is between 8 and 128, it accumulates the sums in 8 bins r[0]-r[7] then sums them by ((r[0] + r[1]) + (r[2] + r[3])) + ((r[4] + r[5]) + (r[6] + r[7])).
Otherwise, it recursively sums two halves of the array.

Therefore, in the first case you get a.sum() = a[0] + a[1] + a[2] + a[3] and in the second case b.sum() = (a[0] + a[1]) + (a[2] + a[3]) which leads to a.sum() - b.sum() != 0.