I was wondering whether there is a way to make one array refer to the data of several different arrays, without copying it.
Example:
import numpy as np
a = np.array([2,3,4,5,6])
b = np.array([5,6,7,8])
c = np.empty(len(a) + len(b), dtype=a.dtype)
offset = 0
c[offset:offset+len(a)] = a
offset += len(a)
c[offset:offset+len(b)] = b
However, in the example above, c is a new array, so if you modify an element of a or b, it is not modified in c at all.
I would like each index of c (i.e. c[0], c[1], etc.) to refer to the corresponding element of a or b, like a pointer, without making a deepcopy of the data.
As @Jaime says, you can't generate a new array whose contents point to elements in multiple existing arrays, but you can do the opposite:
import numpy as np
c = np.arange(2, 9)
a = c[:5]
b = c[3:]
print(a, b, c)
# [2 3 4 5 6] [5 6 7 8] [2 3 4 5 6 7 8]
b[0] = -1
print(c)
# [ 2  3  4 -1  6  7  8]
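If you want to confirm that the slices really are views rather than copies, something like the following sketch should work. It only uses the base attribute and np.shares_memory; the name joined is just an illustrative variable, not something from the question.

```python
import numpy as np

c = np.arange(2, 9)
a = c[:5]
b = c[3:]

# Basic slices are views: they reference c's buffer instead of owning data.
print(a.base is c, b.base is c)      # True True
print(np.shares_memory(a, c))        # True

# a and b overlap on c[3] and c[4], so a write through b is visible in a and c.
b[0] = -1
print(a[3], c[3])                    # -1 -1

# np.concatenate, by contrast, allocates a fresh array (hypothetical name: joined).
joined = np.concatenate([a, b])
print(np.shares_memory(joined, c))   # False
```

Note that a and b overlap here, so they are not independent of each other either; that is a consequence of the values chosen in the question.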
I think the fundamental problem with what you're asking for is that a numpy array must be backed by a single block of memory that can be regularly strided, so that the address of every element can be computed from the start address and the strides.
In your example, a and b will be allocated in separate, non-adjacent blocks of memory, so there is no way to address their elements using a single set of strides.
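To make the stride argument a bit more concrete, here is a rough sketch. It reads the buffer address from __array_interface__, and x and y are hypothetical stand-ins for the separately created a and b from the question.

```python
import numpy as np

c = np.arange(2, 9)
b = c[3:]

# Element i of a 1-D array lives at: start address + i * strides[0].
step = c.strides[0]                        # bytes between consecutive elements
addr_c = c.__array_interface__["data"][0]  # start address of c's data
addr_b = b.__array_interface__["data"][0]  # start address of the view b

# b is simply c's buffer seen from an offset of 3 elements.
print(addr_b - addr_c == 3 * step)         # True

# Independently created arrays (hypothetical x, y) each own their own buffer,
# so one (start address, strides) pair cannot describe both of them.
x = np.array([2, 3, 4, 5, 6])
y = np.array([5, 6, 7, 8])
print(x.base is None and y.base is None)   # True
print(np.shares_memory(x, y))              # False
```

This is why going from one big array to several views works, while the reverse direction needs a copy (for example via np.concatenate).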