Numpy rolling window over 2D array, as a 1D array

Posted 2019-07-09 03:07

When using np.lib.stride_tricks.as_strided, how can I handle a 2D array whose nested arrays are the data values? Is there a preferable, efficient approach?

Specifically, if I have a 2D np.array looking as follows, where each data item in a 1D array is an array of length 2:

[[1., 2.],[3., 4.],[5.,6.],[7.,8.],[9.,10.]...]

I want to reshape for rolling over as follows:

[[[1., 2.],[3., 4.],[5.,6.]],
 [[3., 4.],[5.,6.],[7.,8.]],
 [[5.,6.],[7.,8.],[9.,10.]],
  ...
]

I have had a look at similar answers (e.g. this rolling window function), but when I apply them I cannot keep the inner arrays/tuples intact.

For example with a window length of 3: I have tried a shape of (len(seq)+3-1, 3, 2) and a stride of (2 * 8, 2 * 8, 8), but no luck. Maybe I am missing something obvious?

Cheers.


EDIT: It is easy to produce a functionally identical solution using Python built-ins (which can be optimised using e.g. np.arange similar to Divakar's solution), however, what about using as_strided? From my understanding, this could be used for a highly efficient solution?

3 answers
ゆ 、 Hurt°
#2 · 2019-07-09 03:42

Your task is similar to this one, so I adapted that solution slightly.

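# Rolling window for 2D arrays in NumPy
import numpy as np

def rolling_window(a, shape):
    # View of every window of size `shape` over the 2D array `a`.
    # Result shape: (window positions along rows, along columns) + shape
    s = (a.shape[0] - shape[0] + 1,) + (a.shape[1] - shape[1] + 1,) + shape
    # Moving the window and moving inside it use the same byte steps.
    strides = a.strides + a.strides
    return np.lib.stride_tricks.as_strided(a, shape=s, strides=strides)

x = np.array([[1,2],[3,4],[5,6],[7,8],[9,10],[3,4],[5,6],[7,8],[11,12]])
y = np.array([[3,4],[5,6],[7,8]])
# Compare every window with y, then reduce over the two window axes.
found = np.all(np.all(rolling_window(x, y.shape) == y, axis=2), axis=2)
print(found.nonzero()[0])  # row offsets where y occurs in x: [1 5]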
爱情/是我丢掉的垃圾
#3 · 2019-07-09 03:50

What was wrong with your as_strided trial? It works for me.

In [28]: x=np.arange(1,11.).reshape(5,2)
In [29]: x.shape
Out[29]: (5, 2)
In [30]: x.strides
Out[30]: (16, 8)
In [31]: np.lib.stride_tricks.as_strided(x,shape=(3,3,2),strides=(16,16,8))
Out[31]: 
array([[[  1.,   2.],
        [  3.,   4.],
        [  5.,   6.]],

       [[  3.,   4.],
        [  5.,   6.],
        [  7.,   8.]],

       [[  5.,   6.],
        [  7.,   8.],
        [  9.,  10.]]])

On my first edit I used an int array, so had to use (8,8,4) for the strides.

Your shape could be wrong: a sequence of length N has N-3+1 windows of length 3, not N+3-1. If the shape is too large, the view starts seeing values off the end of the data buffer.

   [[  7.00000000e+000,   8.00000000e+000],
    [  9.00000000e+000,   1.00000000e+001],
    [  8.19968827e-257,   5.30498948e-313]]])

Here only the display format changes (the garbage values force scientific notation); the 7, 8, 9, 10 are still there. Writing to those out-of-bounds slots could be dangerous, corrupting other parts of your program's memory. as_strided is best used for read-only purposes; writes/sets are trickier.
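
To make the shape and stride arithmetic above less error-prone, here is a minimal sketch that derives both from the array itself instead of hard-coding byte counts. The helper name rolling_rows is hypothetical (it does not come from the posts above), and it assumes a 2D input. It passes writeable=False so the view cannot be written to by accident.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def rolling_rows(a, n):
    """Hypothetical helper: read-only view of all length-n windows over the rows of a 2D array."""
    a = np.asarray(a)
    if a.ndim != 2:
        raise ValueError("expected a 2D array")
    rows, cols = a.shape
    if not 1 <= n <= rows:
        raise ValueError("window length must be between 1 and the number of rows")
    row_stride, col_stride = a.strides
    return as_strided(
        a,
        # rows - n + 1 windows, each holding n rows of `cols` values
        shape=(rows - n + 1, n, cols),
        # stepping to the next window and to the next row inside a window
        # both advance by one row of the original array
        strides=(row_stride, row_stride, col_stride),
        writeable=False,
    )

x = np.arange(1, 11.).reshape(5, 2)
w = rolling_rows(x, 3)
print(w.shape)  # (3, 3, 2)
```

Because the strides are read from a.strides, the same call should work for int and float arrays alike, which avoids the (16,16,8) versus (8,8,4) adjustment mentioned above.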

成全新的幸福
#4 · 2019-07-09 04:00

IIUC you could do something like this -

def rolling_window2D(a,n):
    # a: 2D Input array 
    # n: Group/sliding window length
    return a[np.arange(a.shape[0]-n+1)[:,None] + np.arange(n)]

Sample run -

In [110]: a
Out[110]: 
array([[ 1,  2],
       [ 3,  4],
       [ 5,  6],
       [ 7,  8],
       [ 9, 10]])

In [111]: rolling_window2D(a,3)
Out[111]: 
array([[[ 1,  2],
        [ 3,  4],
        [ 5,  6]],

       [[ 3,  4],
        [ 5,  6],
        [ 7,  8]],

       [[ 5,  6],
        [ 7,  8],
        [ 9, 10]]])
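
As a rough illustration of what the indexing approach does, the sketch below prints the index grid it builds and compares the result with np.lib.stride_tricks.sliding_window_view. That function is not mentioned in the posts above and is assumed to be available (NumPy 1.20 or later). The indexing version returns a copy, while sliding_window_view returns a read-only view with the window axis last, so the axis is moved to match the layout asked for in the question.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

a = np.arange(1, 11).reshape(5, 2)
n = 3

# Index grid used by the fancy-indexing approach: row i lists the
# source rows that make up window i. Indexing with it returns a copy.
idx = np.arange(a.shape[0] - n + 1)[:, None] + np.arange(n)
by_index = a[idx]

# sliding_window_view appends the window axis last, giving shape (3, 2, 3),
# so move it to position 1 to get (windows, n, columns).
by_view = np.moveaxis(sliding_window_view(a, n, axis=0), -1, 1)

print(idx)
print(by_index.shape, by_view.shape)
print(np.shares_memory(by_index, a), np.shares_memory(by_view, a))
```

The copy costs extra memory proportional to the window length, but it is safe to modify; the view costs no extra memory but shares its data with the original array.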