Downsample a 1D numpy array

Posted 2019-01-23 04:19

Question:

I have a 1-d numpy array which I would like to downsample. Any of the following methods are acceptable if the downsampling raster doesn't perfectly fit the data:

  • overlap downsample intervals
  • convert whatever number of values remains at the end to a separate downsampled value
  • interpolate to fit raster

Basically, if I have

1 2 6 2 1

and I am downsampling by a factor of 3, all of the following are ok:

3 3

3 1.5

or whatever an interpolation would give me here.

I'm just looking for the fastest/easiest way to do this.

I found scipy.signal.decimate, but that sounds like it decimates the values (takes them out as needed and only leaves one in X). scipy.signal.resample seems to have the right name, but I do not understand where they are going with the whole Fourier thing in the description. My signal is not particularly periodic.

Could you give me a hand here? This seems like a really simple task to do, but all these functions are quite intricate...

Answer 1:

In the simple case where your array's size is divisible by the downsampling factor (R), you can reshape your array, and take the mean along the new axis:

import numpy as np
a = np.array([1.,2,6,2,1,7])
R = 3
a.reshape(-1, R)
=> array([[ 1.,  2.,  6.],
         [ 2.,  1.,  7.]])

a.reshape(-1, R).mean(axis=1)
=> array([ 3.        ,  3.33333333])

In the general case, you can pad your array with NaNs to a size divisible by R, and take the mean using np.nanmean, which ignores the NaN padding. (Older code used scipy.nanmean, which is no longer available in current SciPy releases.)

import math
b = np.append(a, [4])
b.shape
=> (7,)
# number of NaNs needed to reach the next multiple of R
pad_size = int(math.ceil(b.size / float(R))) * R - b.size
b_padded = np.append(b, np.full(pad_size, np.nan))
b_padded.shape
=> (9,)
np.nanmean(b_padded.reshape(-1, R), axis=1)
=> array([ 3.        ,  3.33333333,  4.        ])
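The padding steps above can be wrapped into a small helper. This is only a sketch: the name downsample_mean is made up for illustration, and it assumes a non-empty 1-D input and an integer factor of at least 1.

```python
import numpy as np

def downsample_mean(x, factor):
    """Average consecutive blocks of `factor` samples.

    Hypothetical helper: the last block may hold fewer than `factor`
    samples, in which case only the real samples are averaged.
    """
    x = np.asarray(x, dtype=float)
    # Number of NaNs needed to reach the next multiple of `factor`
    pad_size = -x.size % factor
    padded = np.append(x, np.full(pad_size, np.nan))
    # nanmean skips the NaN padding in the final row
    return np.nanmean(padded.reshape(-1, factor), axis=1)

# The example from the question: 1 2 6 2 1 downsampled by a factor of 3
print(downsample_mean([1, 2, 6, 2, 1], 3))
```

For the question's example this gives 3 and 1.5, the second of the two results listed there as acceptable.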


Answer 2:

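If the array size is not divisible by the downsampling factor, you can split the array into R nearly equal chunks, using np.linspace to generate the chunk boundaries, and then take the mean of each chunk. Note that R here is the number of chunks (the length of the output), not the downsampling factor.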

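import numpy as np

input_arr = np.arange(531)
R = 150  # number of chunks, i.e. the length of the downsampled array

# R + 1 integer boundaries running from 0 to len(input_arr)
split_arr = np.linspace(0, len(input_arr), num=R+1, dtype=int)

# Splitting at the final boundary leaves an empty trailing piece, dropped below
dwnsmpl_subarr = np.split(input_arr, split_arr[1:])

dwnsmpl_arr = np.array([np.mean(item) for item in dwnsmpl_subarr[:-1]])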
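For long arrays, the per-chunk Python loop can probably be avoided. The sketch below swaps in np.add.reduceat to sum each chunk in one call and then divides by the chunk lengths. The function name chunk_means is hypothetical, and the code assumes the input has at least n_out samples so that no chunk is empty.

```python
import numpy as np

def chunk_means(x, n_out):
    """Mean of n_out nearly equal consecutive chunks of x.

    Hypothetical helper; assumes len(x) >= n_out so every chunk
    contains at least one sample.
    """
    x = np.asarray(x, dtype=float)
    # n_out + 1 integer boundaries from 0 to len(x), as in the answer above
    edges = np.linspace(0, x.size, num=n_out + 1, dtype=int)
    # Sum of each chunk x[edges[i]:edges[i+1]]; the last chunk runs to the end
    sums = np.add.reduceat(x, edges[:-1])
    # Chunk lengths, used to turn sums into means
    counts = np.diff(edges)
    return sums / counts

print(chunk_means(np.arange(531), 150).shape)
```

The chunk boundaries are the same as in the np.split version, so the two approaches should return the same values.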