I have a numpy structured array of the following form:
import numpy as np
x = np.array([(1,2,3)]*2, [('t', np.int16), ('x', np.int8), ('y', np.int8)])
I now want to generate views into this array that pair 't' with either 'x' or 'y'. The usual syntax creates a copy:
v_copy = x[['t', 'y']]
v_copy
#array([(1, 3), (1, 3)],
# dtype=[('t', '<i2'), ('y', '|i1')])
v_copy.base is None
#True
This is not unexpected, since picking two fields is "fancy indexing", at which point numpy gives up and makes a copy. Since my actual records are large, I want to avoid the copy at all costs.
However, the required elements can certainly be addressed within numpy's strided memory model. Looking at the individual bytes in memory:
x.view(np.int8)
#array([1, 0, 2, 3, 1, 0, 2, 3], dtype=int8)
one can figure out the necessary strides:
v = np.recarray((2,2), [('b', np.int8)], buf=x, strides=(4,3))
v
#rec.array([[(1,), (3,)],
# [(1,), (3,)]],
# dtype=[('b', '|i1')])
v.base is x
#True
Clearly, v points to the correct locations in memory without having created a copy. Unfortunately, numpy won't allow me to reinterpret these memory locations as the original data types:
v_view = v.view([('t', np.int16), ('y', np.int8)])
#ValueError: new type not compatible with array.
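My guess at why this fails (an assumption on my part, not checked against the numpy source): the strided byte array has 1-byte elements spaced 3 bytes apart along its last axis, while the requested record type is 3 bytes wide, so numpy cannot regroup the elements into records. The sketch below only inspects these properties. It uses a plain np.ndarray instead of np.recarray to keep things minimal, and the name target is introduced purely for illustration.

```python
import numpy as np

x = np.array([(1, 2, 3)] * 2,
             [('t', np.int16), ('x', np.int8), ('y', np.int8)])

# Same byte-level strided array as above, built as a plain ndarray.
v = np.ndarray((2, 2), dtype=np.int8, buffer=x, strides=(4, 3))

# Packed record type I would like to reinterpret the bytes as.
target = np.dtype([('t', np.int16), ('y', np.int8)])

print(v.dtype.itemsize, v.strides)  # element size and strides of the byte array
print(target.itemsize)              # size of one packed target record

# The reinterpretation is rejected; the exact message depends on the numpy version.
try:
    v.view(target)
    view_failed = False
except ValueError as exc:
    view_failed = True
    print("view failed:", exc)
```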
Is there a way to trick numpy into doing this cast, so that an array v_view equivalent to v_copy is created, but without having made a copy? Perhaps working directly on v.__array_interface__, as is done in np.lib.stride_tricks.as_strided()?
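One direction that might work, offered only as a sketch: instead of going through a strided byte array, describe the wanted fields with a dtype that carries explicit byte offsets and has the same itemsize as the original record, then view x with that dtype. The helper name fields_view is hypothetical, and I have not verified this across numpy versions.

```python
import numpy as np

x = np.array([(1, 2, 3)] * 2,
             [('t', np.int16), ('x', np.int8), ('y', np.int8)])

def fields_view(arr, names):
    """Hypothetical helper: view of `arr` restricted to the fields in `names`."""
    # arr.dtype.fields maps each field name to (dtype, byte offset).
    sub = np.dtype({
        'names': list(names),
        'formats': [arr.dtype.fields[n][0] for n in names],
        'offsets': [arr.dtype.fields[n][1] for n in names],
        # Keep the original record size so the strides stay unchanged.
        'itemsize': arr.dtype.itemsize,
    })
    return arr.view(sub)

v_view = fields_view(x, ['t', 'y'])
print(v_view['t'], v_view['y'])
print(np.shares_memory(v_view, x))
```

Because the new dtype keeps the 4-byte itemsize, the skipped field 'x' is simply treated as padding inside each record, and writes through v_view should show up in x.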