MATLAB's smooth implementation (n-point moving

2020-06-28 07:04发布

问题:

Matlab's smooth function, by default, smooths data using a 5-point moving average. What would be the best way to do the same in python? For example, if this is my data

0
0.823529411764706
0.852941176470588
0.705882352941177
0.705882352941177
0.676470588235294
0.676470588235294
0.500000000000000
0.558823529411765
0.647058823529412
0.705882352941177
0.705882352941177
0.617647058823529
0.705882352941177
0.735294117647059
0.735294117647059
0.588235294117647
0.588235294117647
1
0.647058823529412
0.705882352941177
0.764705882352941
0.823529411764706
0.647058823529412
0.735294117647059
0.794117647058824
0.794117647058824
0.705882352941177
0.676470588235294
0.794117647058824
0.852941176470588
0.735294117647059
0.647058823529412
0.647058823529412
0.676470588235294
0.676470588235294
0.529411764705882
0.676470588235294
0.794117647058824
0.882352941176471
0.735294117647059
0.852941176470588
0.823529411764706
0.764705882352941
0.558823529411765
0.588235294117647
0.617647058823529
0.647058823529412
0.588235294117647
0.617647058823529
0.647058823529412
0.794117647058824
0.823529411764706
0.647058823529412
0.617647058823529
0.647058823529412
0.676470588235294
0.764705882352941
0.676470588235294
0.647058823529412
0.705882352941177
0.764705882352941
0.705882352941177
0.500000000000000
0.529411764705882
0.529411764705882
0.647058823529412
0.676470588235294
0.588235294117647
0.735294117647059
0.794117647058824
0.852941176470588
0.764705882352941

the smoothed data should be

0
0.558823529411765
0.617647058823530
0.752941176470588
0.723529411764706
0.652941176470588
0.623529411764706
0.611764705882353
0.617647058823530
0.623529411764706
0.647058823529412
0.676470588235294
0.694117647058824
0.700000000000000
0.676470588235294
0.670588235294118
0.729411764705882
0.711764705882353
0.705882352941177
0.741176470588235
0.788235294117647
0.717647058823529
0.735294117647059
0.752941176470588
0.758823529411765
0.735294117647059
0.741176470588235
0.752941176470588
0.764705882352941
0.752941176470588
0.741176470588235
0.735294117647059
0.711764705882353
0.676470588235294
0.635294117647059
0.641176470588236
0.670588235294118
0.711764705882353
0.723529411764706
0.788235294117647
0.817647058823530
0.811764705882353
0.747058823529412
0.717647058823530
0.670588235294118
0.635294117647059
0.600000000000000
0.611764705882353
0.623529411764706
0.658823529411765
0.694117647058824
0.705882352941176
0.705882352941176
0.705882352941176
0.682352941176471
0.670588235294118
0.676470588235294
0.682352941176471
0.694117647058824
0.711764705882353
0.700000000000000
0.664705882352941
0.641176470588236
0.605882352941177
0.582352941176471
0.576470588235294
0.594117647058824
0.635294117647059
0.688235294117647
0.729411764705882
0.747058823529412
0.803921568627451
0.764705882352941

The syntax in Matlab to get this is

smooth(data)

I want to do the same in python but I am unable to find any function that would do this.

回答1:

MATLAB's smoooth func is basically same as averaging across sliding windows of length 5, except the way it treats the 2 elems at either ends. As per the linked docs, those boundary cases are computed with these formulae -

yy = smooth(y) smooths the data in the column vector y ..
The first few elements of yy are given by

yy(1) = y(1)
yy(2) = (y(1) + y(2) + y(3))/3
yy(3) = (y(1) + y(2) + y(3) + y(4) + y(5))/5
yy(4) = (y(2) + y(3) + y(4) + y(5) + y(6))/5
...

So, to replicate the same implementation on NumPy/Python, we can use NumPy's 1D convolution for getting sliding windowed summations and divide them by the window length to give us the average results. Then, simply append the special case treated values for the boundary elems.

Thus, we would have an implementation to handle generic window sizes, like so -

def smooth(a,WSZ):
    # a: NumPy 1-D array containing the data to be smoothed
    # WSZ: smoothing window size needs, which must be odd number,
    # as in the original MATLAB implementation
    out0 = np.convolve(a,np.ones(WSZ,dtype=int),'valid')/WSZ    
    r = np.arange(1,WSZ-1,2)
    start = np.cumsum(a[:WSZ-1])[::2]/r
    stop = (np.cumsum(a[:-WSZ:-1])[::2]/r)[::-1]
    return np.concatenate((  start , out0, stop  ))