tl;dr Can I reshape a view of a numpy array from 5x5x5x3x3x3 to 125x1x1x3x3x3 without using numpy.reshape?
I would like to perform a sliding window operation (with different strides) on a volume of size MxMxM. The sliding window array can be generated with numpy.lib.stride_tricks.as_strided, as previously suggested by Benjamin and Eickenberg and as demonstrated in the code snippet below, which uses a helper method from skimage that is built on as_strided.
The output from this helper method has shape NxNxNxnxnxn, but I'd prefer the shape to be N**3x1x1xnxnxn. np.reshape achieves this, but it becomes slow when the volume gets large (> 100x100x100), and I'm not sure why. I thought I could use as_strided to reshape the output, but numpy crashes (code snippet below). Any ideas on how I can get a view of the helper method's output as N**3x1x1xnxnxn without using np.reshape?
import numpy as np
from skimage.util import view_as_blocks

l = 15  # edge length of the volume
s = 3   # edge length of each block
X = np.ones((l, l, l))
print('actual shape', X.shape)
view = view_as_blocks(X, (s, s, s))
print('original view', view.shape)
# Integer division: shape entries must be ints (l/s is a float in Python 3)
new_shape = ((l // s)**3, 1, 1, s, s, s)
print('new view', new_shape)
view_correct = view.reshape(new_shape)
print(view_correct.shape)
print('coord:', '124,0,0,2,2,2', 'value:', view_correct[124, 0, 0, 2, 2, 2])
# Attempt to get the same shape as a view; this is the call that misbehaves
view_incorrect = np.lib.stride_tricks.as_strided(view, shape=new_shape)
print(view_incorrect.shape)
print('coord:', '124,0,0,2,2,2', 'value:', view_incorrect[124, 0, 0, 2, 2, 2])
I took an example from view_as_blocks, and tried your style of reshape:
import numpy as np
from skimage.util import view_as_blocks

A = np.arange(4*4).reshape(4,4)
B = view_as_blocks(A, block_shape=(2, 2))
print(A.__array_interface__)  # 'data' holds the buffer address
print(B.__array_interface__)
C = B.reshape((2*2,2,2))
print(C.__array_interface__)
producing:
{'typestr': '<i4', 'data': (153226600, False), 'shape': (4, 4),
'descr': [('', '<i4')], 'version': 3, 'strides': None}
{'typestr': '<i4', 'data': (153226600, False), 'shape': (2, 2, 2, 2),
'descr': [('', '<i4')], 'version': 3, 'strides': (32, 8, 16, 4)}
{'typestr': '<i4', 'data': (150895960, False), 'shape': (4, 2, 2),
'descr': [('', '<i4')], 'version': 3, 'strides': None}
The data pointer for A and B is the same; B is a view on A. But the pointer for C is different. It is a copy. That explains why it takes so long in your case.
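The same check can be made without comparing pointers by eye. Here is a minimal sketch, assuming np.shares_memory is available (it is in current NumPy releases). To keep the snippet NumPy-only, it builds the block view with reshape plus transpose as a stand-in for view_as_blocks; that substitution is mine, not part of skimage.

```python
import numpy as np

A = np.arange(4 * 4, dtype=np.int32).reshape(4, 4)

# Stand-in for view_as_blocks(A, (2, 2)): split each axis into
# (block index, offset within block), then move the block indices to the front.
B = A.reshape(2, 2, 2, 2).transpose(0, 2, 1, 3)
C = B.reshape(2 * 2, 2, 2)

print(B.strides)               # (32, 8, 16, 4), matching the output above
print(np.shares_memory(A, B))  # True: B is a view on A
print(np.shares_memory(A, C))  # False: reshape had to copy
```

Timing the reshape on a large volume should then show the cost of that copy rather than of the reshape bookkeeping itself.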
Let's do that a little differently:
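import numpy as np
from numpy.lib.stride_tricks import as_strided
from skimage.util import view_as_blocks

A = np.arange(4*4).reshape(4,4)
B = view_as_blocks(A, block_shape=(2, 2))
print(A.__array_interface__)
print(B.__array_interface__)
C = B.reshape((2*2,1,2,2))
print(C.__array_interface__)
# New shape only; as_strided keeps B's strides (this is the unsafe part)
D = as_strided(B, shape=(2*2,1,2,2))
print(D.__array_interface__)
# The last block, read three ways
print(B[1,1,:,:])
print(C[3,0,:,:])
print(D[3,0,:,:])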
producing
1254:~/mypy$ python3 skshape.py
{'strides': None, 'typestr': '<i4', 'version': 3,
'data': (154278744, False), 'shape': (4, 4), 'descr': [('', '<i4')]}
{'strides': (32, 8, 16, 4), 'typestr': '<i4', 'version': 3,
'data': (154278744, False), 'shape': (2, 2, 2, 2), 'descr': [('', '<i4')]}
{'strides': None, 'typestr': '<i4', 'version': 3,
'data': (155705400, False), 'shape': (4, 1, 2, 2), 'descr': [('', '<i4')]}
{'strides': (32, 8, 16, 4), 'typestr': '<i4', 'version': 3,
'data': (154278744, False), 'shape': (4, 1, 2, 2), 'descr': [('', '<i4')]}
[[10 11]
 [14 15]]
[[10 11]
 [14 15]]
[[ 154561960 -1217783696]
 [ 48 3905]]
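Again the reshape creates a copy. The second call, as_strided, returns a view, but the striding is wrong: it keeps B's strides (32, 8, 16, 4) under the new shape, so D[3,0] starts 3*32 = 96 bytes into a 64-byte buffer. It is looking at memory outside the original data buffer (that's part of why playing with strides on your own is dangerous).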
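The out-of-bounds read follows from simple arithmetic: the byte offset of an element is the dot product of its index with the strides. Here is a small sketch of that calculation, using the shape and strides printed for D above. The helper name byte_offset is made up for illustration.

```python
import numpy as np

itemsize = 4                      # '<i4' means 4-byte integers
buffer_nbytes = 4 * 4 * itemsize  # A holds 16 values, so 64 bytes
strides_D = (32, 8, 16, 4)        # strides reported for D

def byte_offset(index, strides):
    """Offset in bytes from the start of the buffer for a given index."""
    return int(np.dot(index, strides))

start = byte_offset((3, 0, 0, 0), strides_D)
print(start, buffer_nbytes)       # 96 64
print(start >= buffer_nbytes)     # True: D[3, 0] begins past the end of A's data
```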
In my example, look at the first corner value of each block:
print(B[:,:,0,0])
print(C[:,0,0,0])
[[ 0  2]
 [ 8 10]]
[ 0  2  8 10]
For B, the rows increase by 8 and the columns by 2; that's reflected in the (32,8) striding, i.e. (4*8,4*2) bytes for 4-byte items.
But in C the steps are (2,6,2); striding can't do that, because a stride is a single constant step along an axis.
From this I conclude that the reshape is impossible without a copy.
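The uneven step can be checked directly. This sketch again uses a reshape/transpose stand-in for view_as_blocks (my substitution, so that only NumPy is needed). Because A is an arange, each value equals its element offset, so the differences between consecutive block corners are the steps a merged first axis would have to take.

```python
import numpy as np

A = np.arange(16, dtype=np.int32).reshape(4, 4)
B = A.reshape(2, 2, 2, 2).transpose(0, 2, 1, 3)  # same layout as view_as_blocks(A, (2, 2))

corners = B[:, :, 0, 0].ravel()  # first element of each block, in block order
steps = np.diff(corners)
print(corners)                   # [ 0  2  8 10]
print(steps)                     # [2 6 2]

# One stride means one constant step along the axis.
print(len(set(steps.tolist())) == 1)  # False: no single stride fits
```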