NumPy sum along disjoint indices

Posted 2019-07-03 11:07

Question:

I have an application where I need to sum across arbitrary groups of indices in a 3D NumPy array. The built-in NumPy array sum routine sums up all indices along one of the dimensions of an ndarray. Instead, I need to sum up ranges of indices along one of the dimensions in my array and return a new array.

For example, let's assume that I have an ndarray with shape (75,25,3). I wish to sum along the first dimension over certain index ranges and return a new 3D array. Consider the sums over 0:25, 25:50 and 50:75, which would return an array of shape (3,25,3).

Is there an easy way to do "disjoint sums" along one dimension of a NumPy array to produce this result?

Answer 1:

You can use np.add.reduceat as a general approach to this problem. This works even if the ranges are not all the same length.

To sum the slices 0:25, 25:50 and 50:75 along axis 0, pass in indices [0, 25, 50]:

np.add.reduceat(a, [0, 25, 50], axis=0)
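
As a quick sanity check, the sketch below compares the reduceat result with explicit slice sums. The array contents are made up for illustration; only the call itself comes from the answer above.

```python
import numpy as np

# Illustrative data: 75 rows, so the slices 0:25, 25:50 and 50:75 all exist.
a = np.arange(75 * 25 * 3).reshape(75, 25, 3)

# Each index starts a segment that runs up to the next index;
# the last segment runs to the end of the axis.
sums = np.add.reduceat(a, [0, 25, 50], axis=0)

# Cross-check against summing each slice by hand.
expected = np.stack([a[0:25].sum(axis=0),
                     a[25:50].sum(axis=0),
                     a[50:75].sum(axis=0)])

print(sums.shape)                      # (3, 25, 3)
print(np.array_equal(sums, expected))  # True
```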

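This method can also be used to sum non-contiguous ranges: add the start of each gap as an extra index, then discard the gap sums by keeping every other result with [::2]. For instance, to sum the slices 0:25, 37:47 and 51:75, write: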

np.add.reduceat(a, [0,25, 37,47, 51], axis=0)[::2]
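
To see why the [::2] step is needed, here is a minimal sketch that checks the result against a plain loop over the wanted ranges. The helper sum_ranges is hypothetical (it is not part of NumPy), and the array contents are illustrative.

```python
import numpy as np

a = np.arange(75 * 4 * 3).reshape(75, 4, 3)

def sum_ranges(arr, ranges):
    """Hypothetical helper: sum arr over each half-open (start, stop) range on axis 0."""
    return np.stack([arr[start:stop].sum(axis=0) for start, stop in ranges])

# With indices [0, 25, 37, 47, 51], reduceat sums the segments
# 0:25, 25:37, 37:47, 47:51 and 51:75. Taking every other result
# drops the unwanted gaps 25:37 and 47:51.
picked = np.add.reduceat(a, [0, 25, 37, 47, 51], axis=0)[::2]

expected = sum_ranges(a, [(0, 25), (37, 47), (51, 75)])
print(picked.shape)                      # (3, 4, 3)
print(np.array_equal(picked, expected))  # True
```

The indices must be strictly increasing for this to work: where an index is not smaller than the next one, reduceat returns the single element at that index rather than a sum. If the last wanted range should stop before the end of the axis, append its stop index as well; the trailing segment then lands at an odd position and [::2] discards it.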

An alternative approach to summing ranges of the same length is to reshape the array and then sum along an axis. The equivalent to the first example above would be:

a.reshape(3, a.shape[0]//3, a.shape[1], a.shape[2]).sum(axis=1)
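
A minimal sketch of the reshape approach, assuming the length of axis 0 is an exact multiple of the number of groups (the name n_groups and the array contents are illustrative):

```python
import numpy as np

a = np.arange(75 * 25 * 3).reshape(75, 25, 3)
n_groups = 3

# Split axis 0 into (n_groups, rows_per_group) and sum over rows_per_group.
# -1 lets NumPy infer rows_per_group; reshape raises ValueError if the
# length of axis 0 is not divisible by n_groups.
grouped = a.reshape(n_groups, -1, *a.shape[1:]).sum(axis=1)

# Same result as the reduceat call with equally spaced indices.
same = np.array_equal(grouped, np.add.reduceat(a, [0, 25, 50], axis=0))
print(grouped.shape)  # (3, 25, 3)
print(same)           # True
```

For a contiguous array the reshape is a view, so no data is copied before the sum.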


Answer 2:

Just sum each portion and use the results to create a new array.

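import numpy as np

# Boundaries between the three groups along axis 0
i1, i2 = (2, 7)

a = np.ones((10, 5, 3))
b = np.sum(a[0:i1, ...], axis=0)   # rows 0:2
c = np.sum(a[i1:i2, ...], axis=0)  # rows 2:7
d = np.sum(a[i2:, ...], axis=0)    # rows 7:10

# Stack the three partial sums into one array
g = np.array([b, c, d])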

>>> g.shape
(3, 5, 3)
>>> g
array([[[ 2.,  2.,  2.],
        [ 2.,  2.,  2.],
        [ 2.,  2.,  2.],
        [ 2.,  2.,  2.],
        [ 2.,  2.,  2.]],

       [[ 5.,  5.,  5.],
        [ 5.,  5.,  5.],
        [ 5.,  5.,  5.],
        [ 5.,  5.,  5.],
        [ 5.,  5.,  5.]],

       [[ 3.,  3.,  3.],
        [ 3.,  3.,  3.],
        [ 3.,  3.,  3.],
        [ 3.,  3.,  3.],
        [ 3.,  3.,  3.]]])
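
The same idea extends to any number of groups by looping over the boundaries. The sketch below is one possible way to write it; sum_between is a hypothetical helper name, not a NumPy function.

```python
import numpy as np

def sum_between(arr, boundaries):
    """Hypothetical helper: sum arr along axis 0 between consecutive boundaries."""
    # Add the implicit start and end of the axis around the given boundaries.
    edges = [0, *boundaries, arr.shape[0]]
    return np.array([arr[lo:hi].sum(axis=0)
                     for lo, hi in zip(edges[:-1], edges[1:])])

a = np.ones((10, 5, 3))
g = sum_between(a, [2, 7])   # groups 0:2, 2:7 and 7:10

print(g.shape)     # (3, 5, 3)
print(g[:, 0, 0])  # [2. 5. 3.]
```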


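Answer 3:

You can use np.split to split the array into equal parts, then use np.sum with axis=1 to sum within each part. np.sum stacks the list of parts along a new leading axis, so the original first axis becomes axis 1: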

np.sum(np.split(my_array,3),axis=1)

Demo:

>>> import numpy as np
>>> a = np.arange(270).reshape(30, 3, 3)
>>> np.sum(np.split(a,3),axis=1)
array([[[ 405,  415,  425],
        [ 435,  445,  455],
        [ 465,  475,  485]],

       [[1305, 1315, 1325],
        [1335, 1345, 1355],
        [1365, 1375, 1385]],

       [[2205, 2215, 2225],
        [2235, 2245, 2255],
        [2265, 2275, 2285]]])

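Also note that np.split accepts a list of split points (the end of each slice) instead of a number of sections. Summing the resulting list with a single np.sum call only works when all the pieces have the same length, as they do here; pieces of different lengths cannot be stacked into one array:

>>> new = np.sum(np.split(a, [10, 20]), axis=1)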
>>> new
array([[[ 405,  415,  425],
        [ 435,  445,  455],
        [ 465,  475,  485]],

       [[1305, 1315, 1325],
        [1335, 1345, 1355],
        [1365, 1375, 1385]],

       [[2205, 2215, 2225],
        [2235, 2245, 2255],
        [2265, 2275, 2285]]])
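
When the pieces really do have different lengths, one option is to sum each piece on its own and stack the results afterwards. This is a sketch with illustrative split points, not part of the original answer:

```python
import numpy as np

a = np.arange(270).reshape(30, 3, 3)

# Split points 5 and 20 give pieces of 5, 15 and 10 rows.
pieces = np.split(a, [5, 20])

# The pieces have different shapes, so sum each one along axis 0 first
# and only then stack the equally shaped results.
sums = np.stack([p.sum(axis=0) for p in pieces])

print(sums.shape)  # (3, 3, 3)
print(np.array_equal(sums, np.add.reduceat(a, [0, 5, 20], axis=0)))  # True
```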