Vectorize NumPy mean across the slices of an array

Posted 2019-07-24 17:40

Question:

Is there a way to vectorize a function so that the output is an array of means, where each element is the mean of the input values from index 0 up to and including that position? Looping over this is pretty straightforward, but I am trying to be as efficient as possible. For example: element 0 = mean of index 0, element 1 = mean of indices 0-1, element N = mean of indices 0-N.

Answer 1:

The intended operation can be described as cumulative averaging. An obvious solution is to take the cumulative sum and divide each partial sum by the number of elements that contributed to it. A vectorized implementation therefore uses np.cumsum and divides by the element counts obtained from np.arange, generalized for an ndarray like so:

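import numpy as np

def cummean(A, axis):
    """ Cumulative averaging

    Parameters
    ----------
    A    : input ndarray
    axis : axis along which operation is to be performed

    Output
    ------
    Output : Cumulative averages along the specified axis of input ndarray
    """

    # Element counts 1..n, reshaped so that they broadcast along `axis` only.
    # Without the reshape, the division is only correct for the last axis.
    shape = [1] * A.ndim
    shape[axis] = -1
    counts = np.arange(1, A.shape[axis] + 1).reshape(shape)
    return np.true_divide(A.cumsum(axis), counts)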
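
As a quick sanity check, here is a minimal sketch that compares the cumsum-based approach against the plain loop mentioned in the question. The names cummean_sketch and cummean_loop are made up for this illustration and are not part of NumPy; the sketch reshapes the counts so that the division broadcasts along the chosen axis.

```python
import numpy as np

def cummean_sketch(A, axis=0):
    """Cumulative mean along `axis` (hypothetical helper for illustration)."""
    A = np.asarray(A)
    # Counts 1..n shaped like (1, ..., n, ..., 1) so they line up with `axis`.
    shape = [1] * A.ndim
    shape[axis] = -1
    counts = np.arange(1, A.shape[axis] + 1).reshape(shape)
    return A.cumsum(axis) / counts

def cummean_loop(a):
    """Loop baseline for a 1-D array: mean of a[0..i] for every i."""
    return np.array([a[:i + 1].mean() for i in range(len(a))])

a = np.arange(1, 11)
vectorized = cummean_sketch(a)
looped = cummean_loop(a)
print(np.allclose(vectorized, looped))

# 2-D input: running means down the rows (axis=0) or across columns (axis=1).
B = np.arange(6).reshape(2, 3)
down_rows = cummean_sketch(B, axis=0)
across_cols = cummean_sketch(B, axis=1)
```

The loop recomputes each prefix mean from scratch, so it does quadratic work in the array length, while the cumsum version makes a single pass.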


Answer 2:

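If you're able to use pandas, there is expanding_mean, which works directly with a NumPy array. Note that this top-level function exists only in older pandas releases; it was later deprecated and removed in favor of the .expanding().mean() method on Series and DataFrame objects: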

In [10]: pandas.expanding_mean(np.arange(1, 11))
Out[10]: array([ 1. ,  1.5,  2. ,  2.5,  3. ,  3.5,  4. ,  4.5,  5. ,  5.5])

This method also works column-wise:

In [11]: A = np.array([[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 
                       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]]).T

In [12]: A
Out[12]: 
array([[ 1,  1],
       [ 2,  1],
       [ 3,  1],
       [ 4,  1],
       [ 5,  1],
       [ 6,  1],
       [ 7,  1],
       [ 8,  1],
       [ 9,  1],
       [10,  1]])

In [13]: pandas.expanding_mean(A)
Out[13]: 
array([[ 1. ,  1. ],
       [ 1.5,  1. ],
       [ 2. ,  1. ],
       [ 2.5,  1. ],
       [ 3. ,  1. ],
       [ 3.5,  1. ],
       [ 4. ,  1. ],
       [ 4.5,  1. ],
       [ 5. ,  1. ],
       [ 5.5,  1. ]])
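
On current pandas versions, where the top-level expanding_mean function is no longer available, roughly the same result can be obtained by wrapping the array in a Series or DataFrame and calling .expanding().mean(). The following is a sketch under that assumption; the variable names are chosen only for this example, and .to_numpy() is used to get a plain NumPy array back.

```python
import numpy as np
import pandas as pd

# 1-D case: expanding (cumulative) mean of 1..10.
one_d = pd.Series(np.arange(1, 11)).expanding().mean().to_numpy()

# 2-D case: the expanding mean is computed independently for each column.
A = np.array([[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
              [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]]).T
two_d = pd.DataFrame(A).expanding().mean().to_numpy()

print(one_d)
print(two_d)
```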