I have an image stored as a 2d numpy array (possibly multi-d).
I can make a view onto that array that reflects a 2d sliding window, but when I reshape it so that each row is a flattened window (rows are windows, columns are pixels in that window), NumPy makes a full copy. It does this because I'm using the typical stride trick, and the new shape can't be described by strides over the original memory.
I need this because I'm passing entire large images to an sklearn classifier that accepts only 2d matrices and has no batch/partial fit procedure, and the full expanded copy is far too large for memory.
My Question: Is there a way to do this without making a full copy of the view?
I believe an answer will either be (1) something about strides or numpy memory management that I've overlooked, or (2) some kind of masked memory structure for python that can emulate a numpy array even to an external package like sklearn that includes cython.
This task of training over moving windows of a 2d image in memory is common, but the only attempt I know of to account for patches directly is the Vigra project (http://ukoethe.github.io/vigra/).
Thanks for the help.
>>> import numpy as np
>>> A = np.arange(9).reshape(3, 3)
>>> print(A)
[[0 1 2]
 [3 4 5]
 [6 7 8]]
>>> xstep = 1; ystep = 1; xsize = 2; ysize = 2
>>> # Integer division (//) keeps the shape entries integral on Python 2 and 3.
>>> window_view = np.lib.stride_tricks.as_strided(
...     A,
...     shape=((A.shape[0] - xsize + 1) // xstep, (A.shape[1] - ysize + 1) // ystep, xsize, ysize),
...     strides=(A.strides[0] * xstep, A.strides[1] * ystep, A.strides[0], A.strides[1]))
>>> print(window_view)
[[[[0 1]
   [3 4]]

  [[1 2]
   [4 5]]]


 [[[3 4]
   [6 7]]

  [[4 5]
   [7 8]]]]
>>>
>>> np.may_share_memory(A,window_view)
True
>>> B=window_view.reshape(-1,xsize*ysize)
>>> np.may_share_memory(A,B)
False
Your task isn't possible using strides alone, but NumPy does support
one kind of array that does the job. With strides and masked_array
you can create the desired view of your data. However, not all
NumPy functions support operations on a masked_array, so it is
possible that scikit-learn doesn't handle them well either.
Let's first take a fresh look at what we are trying to do here.
Consider the input data of your example. Fundamentally the data is
just a 1-d array in the memory, and it is simpler if we think about
the strides with that. The array only appears to be 2-d, because we
have defined its shape. Using strides, the shape could be defined
like this:
import numpy as np
from numpy.lib.stride_tricks import as_strided
base = np.arange(9)
isize = base.itemsize
# Strides are in bytes: one row is 3 items, one column is 1 item.
A = as_strided(base, shape=(3, 3), strides=(3 * isize, isize))
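As a quick sanity check of that claim, the strided array should be indistinguishable from an ordinary reshape of base. This is only a minimal sketch; the names same_values, same_strides and shares are introduced here for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

base = np.arange(9)
isize = base.itemsize
A = as_strided(base, shape=(3, 3), strides=(3 * isize, isize))

reshaped = base.reshape(3, 3)
same_values = np.array_equal(A, reshaped)   # same numbers in the same places
same_strides = A.strides == reshaped.strides  # same byte steps per axis
shares = np.shares_memory(A, base)          # no data was copied
print(same_values, same_strides, shares)
```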
Now the goal is to choose strides for base such that the numbers are
ordered as in the target array, B. In other words, we are asking for
integers a and b such that
>>> as_strided(base, shape=(4, 4), strides=(a, b))
array([[0, 1, 3, 4],
       [1, 2, 4, 5],
       [3, 4, 6, 7],
       [4, 5, 7, 8]])
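But this is impossible: a single stride b can only step through memory
in equal increments, while the first row (0, 1, 3, 4) needs steps of
1, 2 and 1 items. The closest view we can achieve like this is a
rolling window over base: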
>>> C = as_strided(base, shape=(5, 5), strides=(isize, isize))
>>> C
array([[0, 1, 2, 3, 4],
       [1, 2, 3, 4, 5],
       [2, 3, 4, 5, 6],
       [3, 4, 5, 6, 7],
       [4, 5, 6, 7, 8]])
But the difference here is that we have extra columns and rows, which
we would like to get rid of. So, effectively we are asking for a
rolling window which is not contiguous and also makes jumps at regular
intervals. In this example we want to exclude every third column from
each window and to skip every third row.
We can describe this as a masked_array:
>>> mask = np.zeros((5, 5), dtype=bool)
>>> mask[2, :] = True
>>> mask[:, 2] = True
>>> D = np.ma.masked_array(C, mask=mask)
This array contains exactly the data that we want, and its data buffer
is only a view of the original data (the mask is a separate boolean
array). We can confirm that the data is equal:
>>> D.data[~D.mask].reshape(4, 4)
array([[0, 1, 3, 4],
       [1, 2, 4, 5],
       [3, 4, 6, 7],
       [4, 5, 7, 8]])
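The same construction can be sketched for an arbitrary image and window size. The helper masked_windows below is hypothetical (it is not part of NumPy) and assumes a C-contiguous 2-d image and a step of 1 in both directions. Note that the mask is a dense boolean array with the full shape of the rolling view, so it costs memory even though the data does not.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def masked_windows(image, h, w):
    """Masked rolling view whose unmasked rows are the flattened h-by-w windows.

    Hypothetical helper; assumes `image` is a C-contiguous 2-d array.
    """
    H, W = image.shape
    base = image.reshape(-1)  # a view for C-contiguous input
    isize = base.itemsize
    # Row p of the rolling view starts at flat offset p; column q adds offset q.
    n_rows = (H - h) * W + (W - w) + 1  # last window's top-left offset, plus one
    n_cols = (h - 1) * W + (w - 1) + 1  # last in-window offset, plus one
    rolling = as_strided(base, shape=(n_rows, n_cols),
                         strides=(isize, isize), writeable=False)
    # A row is a real window only if its top-left column leaves room for w pixels.
    bad_rows = (np.arange(n_rows) % W) > (W - w)
    # A column belongs to the window only if its offset within an image row is < w.
    bad_cols = (np.arange(n_cols) % W) >= w
    mask = bad_rows[:, None] | bad_cols[None, :]
    return np.ma.masked_array(rolling, mask=mask)


image = np.arange(20).reshape(4, 5)
D = masked_windows(image, 2, 3)
# Boolean indexing copies, so this line is only a correctness check.
windows = D.data[~D.mask].reshape(-1, 2 * 3)
```

For the 3x3 image with a 2x2 window, this helper produces the same 5x5 rolling view and the same mask as the hand-written example.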
But as I said in the beginning, it is quite likely that scikit-learn
doesn't understand masked arrays. If it simply converts this to an
array, the data will be wrong:
>>> np.array(D)
array([[0, 1, 2, 3, 4],
       [1, 2, 3, 4, 5],
       [2, 3, 4, 5, 6],
       [3, 4, 5, 6, 7],
       [4, 5, 6, 7, 8]])
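The two outcomes can be put side by side in a small sketch that rebuilds C, mask and D from above. The names dropped and kept are introduced here for illustration. Converting with np.asarray discards the mask, while compressed() returns only the unmasked values, but as a newly allocated array rather than a view.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

base = np.arange(9)
isize = base.itemsize
C = as_strided(base, shape=(5, 5), strides=(isize, isize))
mask = np.zeros((5, 5), dtype=bool)
mask[2, :] = True
mask[:, 2] = True
D = np.ma.masked_array(C, mask=mask)

dropped = np.asarray(D)              # mask is ignored: the full 5x5 rolling window
kept = D.compressed().reshape(4, 4)  # unmasked values only, copied into a new array
print(dropped.shape, kept.shape)
```

So a consumer that respects the mask still ends up paying for a copy of the selected values.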