I'm confused about the result of numpy reshape applied to a view. In the following, q.flags shows that q does not own its data, but q.base is neither x nor y, so what is it? I'm also surprised to see that q.strides is (8,), which means that it gets the next element by moving 8 bytes in memory each time (if I understand correctly). However, if none of the arrays other than x owns data, the only data buffer is the one from x, and that layout does not permit getting the next element of q by moving 8 bytes.
In [99]: x = np.random.rand(4, 4)
In [100]: y = x.T
In [101]: q = y.reshape(16)
In [102]: q.base is y
Out[102]: False
In [103]: q.base is x
Out[103]: False
In [104]: y.flags
Out[104]:
C_CONTIGUOUS : False
F_CONTIGUOUS : True
OWNDATA : False
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
In [105]: q.flags
Out[105]:
C_CONTIGUOUS : True
F_CONTIGUOUS : True
OWNDATA : False
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
In [106]: q.strides
Out[106]: (8,)
In [107]: x
Out[107]:
array([[ 0.62529694,  0.20813211,  0.73932923,  0.43183722],
       [ 0.09755023,  0.67082005,  0.78412615,  0.40307291],
       [ 0.2138691 ,  0.35191283,  0.57455781,  0.2449898 ],
       [ 0.36476299,  0.36590522,  0.24371933,  0.24837697]])
In [108]: q
Out[108]:
array([ 0.62529694,  0.09755023,  0.2138691 ,  0.36476299,  0.20813211,
        0.67082005,  0.35191283,  0.36590522,  0.73932923,  0.78412615,
        0.57455781,  0.24371933,  0.43183722,  0.40307291,  0.2449898 ,
        0.24837697])
UPDATE:
It turns out that this question has been asked in the numpy discussion forum: http://numpy-discussion.10968.n7.nabble.com/OWNDATA-flag-and-reshape-views-vs-copies-td10363.html
In short: you cannot always rely on ndarray.flags['OWNDATA'] alone. When the first element is changed, x and y both reflect the change but q does not, so q must be backed by its own copy of the data (how is explained below).

There is more discussion about the OWNDATA flag over at the numpy-discussion mailing list. In the "How can I tell if NumPy creates a view or a copy?" question, it is briefly mentioned that simply checking the flags.owndata of an ndarray sometimes seems to fail and that it seems unreliable, as you mention. That's because every ndarray also has a base attribute: the base of an ndarray is a reference to another array if the memory originated elsewhere (otherwise, the base is None).

Take a 2x2 float64 version of your example, so that the strides of y = x.T are (8, 16). The operation y.reshape(4) creates a copy, not a view. To get y reshaped (C-contiguous) to (4,), the memory pointer would have to jump 0->16->8->24, which is not doable with a single stride. Thus q.base points to the array generated by the forced copy inside y.reshape, which has the same shape as y, but copied elements and thus normal strides again: (16, 8). q.base is not bound to any other name, as it only exists as the result of the forced-copy operation y.reshape(4). Only now can the object q.base be viewed in a (4,) shape, because its strides allow this. q is then indeed a view on q.base. The same reasoning applies to your 4x4 case with y.reshape(16).

For most people it would be confusing to see that q.flags.owndata is False, because, as shown above, q is not a view on y. However, it is a view on a copy of y. That copy, q.base, is the owner of the data. Thus the flags are actually correct, if you inspect closely.

I like to use .__array_interface__ to inspect this.

The transpose was performed by reversing the strides. The base data pointer is the same.
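A minimal sketch of that inspection, assuming float64 data and a deterministic stand-in for the random x from the question (the values do not matter here, only the memory layout). The names same_pointer_xy and same_pointer_xq are introduced purely for illustration:

```python
import numpy as np

# Deterministic stand-in for the question's random 4x4 array (an assumption).
# .copy() makes sure x owns its buffer.
x = np.arange(16, dtype=np.float64).reshape(4, 4).copy()
y = x.T
q = y.reshape(16)

for name, arr in (("x", x), ("y", y), ("q", q)):
    info = arr.__array_interface__
    # info["data"] is (address, read_only_flag). info["strides"] is None
    # for a C-contiguous array, so arr.strides is printed alongside it.
    print(name, info["data"][0], info["strides"], arr.strides)

# x and y start at the same address; q starts somewhere else.
same_pointer_xy = x.__array_interface__["data"][0] == y.__array_interface__["data"][0]
same_pointer_xq = x.__array_interface__["data"][0] == q.__array_interface__["data"][0]
print(same_pointer_xy, same_pointer_xq)
```

The addresses themselves differ from run to run; what matters is which arrays report the same one.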
So the q data is a copy (different pointer). Strides (8,) means its elements are accessed by stepping from one f8 to the next. But x.reshape(16) is a view of x, because its data can be accessed with a simple 8-byte step.

To access the original data in the q order, it would have to step 32 bytes 3 times (down the x rows), then go back to the start and step 8 bytes to the 2nd x column, followed by 3 row steps, etc. Since striding doesn't work this way, it has to work from a copy.

Note also that assigning to y[0,0] changes x[0,0], but q[0] is independent of both.

While OWNDATA for q is False, it is True for y.ravel() and y.flatten(). I suspect reshape() in this case is making a copy and then reshaping it, and it's the intermediate copy, q.base, that 'owns' the data.
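The points above can be checked with a short sketch, again assuming a deterministic float64 stand-in for x. The variable names (shares, owner, ravel_owns, flatten_owns) are illustrative only, and the check on q.base is guarded because what reshape stores there is an implementation detail of NumPy rather than something the question's output guarantees:

```python
import numpy as np

# Deterministic stand-in for the question's random array (an assumption).
x = np.arange(16, dtype=np.float64).reshape(4, 4).copy()  # x owns its buffer
y = x.T
q = y.reshape(16)

# q does not overlap x's buffer, so it cannot be a view on x or y.
shares = np.shares_memory(q, x)
print("q shares memory with x:", shares)

# q.base is some other array; inspect it if it exists.
owner = q.base
print("q.base is x:", owner is x, "| q.base is y:", owner is y)
if owner is not None:
    print("q.base shape:", owner.shape, "| owns data:", owner.flags.owndata)

# Writing through y reaches x, but q keeps the old value.
y[0, 0] = -1.0
print("x[0, 0]:", x[0, 0], "| q[0]:", q[0])

# ravel() and flatten() hand back the copy itself, so they own their data.
ravel_owns = y.ravel().flags.owndata
flatten_owns = y.flatten().flags.owndata
print("ravel owns:", ravel_owns, "| flatten owns:", flatten_owns)
```

If the goal is simply to tell whether two arrays are backed by the same buffer, np.shares_memory is a more direct test than reading OWNDATA.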