Numpy matrix of coordinates

2020-08-27 05:07发布

问题:

I'm trying to get a matrix of coordinate-arrays. This is different from numpy.meshgrid. For example, for a 2x2 size I'd want the 2x2x2 output

[[[0,0],[0,1]],
 [[1,0],[1,1]]]

as a numpy array. This probably looks and reads cleaner a 2x2 matrix of tuples:

[[(0,0),(0,1)],
 [(1,0),(1,1)]]

(except I don't think you can have tuples in a numpy array, and it's not the point here)

This simple example can be done by switching the axes of numpy-meshgrid's output (specifically, moving the first axis to be last):

np.array(np.meshgrid([0,1],[0,1])).transpose([1,2,0])

This could be easily generalized to arbitrary dimensions, except that meshgrid doesn't behave as I would expect for more than 2 inputs. Specifically, the returned matrices have coordinate values that vary along axes in an odd order:

In [627]: np.meshgrid([0,1],[0,1],[0,1])
Out[627]:
[array([[[0, 0],
        [1, 1]],

       [[0, 0],
        [1, 1]]]),
 array([[[0, 0],
        [0, 0]],

       [[1, 1],
        [1, 1]]]),
 array([[[0, 1],
        [0, 1]],

       [[0, 1],
        [0, 1]]])]

Notice that the elements of this output vary along axes 1, 0, and 2, respectively. This will build an incorrect coordinate matrix; I would need the output to vary along axes 0, 1, and 2, in that order. So I could do

In [642]: np.array(np.meshgrid([0,1],[0,1],[0,1])).swapaxes(1,2)
Out[642]:
array([[[[0, 0],
         [0, 0]],

        [[1, 1],
         [1, 1]]],


       [[[0, 0],
         [1, 1]],

        [[0, 0],
         [1, 1]]],


       [[[0, 1],
         [0, 1]],

        [[0, 1],
         [0, 1]]]])

But this is starting to get really hacky and I don't know if I can count on this order in higher-dimension meshgrid outputs. numpy.mgrid gives the right order, but doesn't seem to allow arbitrary values, which I will need. So this boils down to two questions:

1) Is there a cleaner way, maybe some function in numpy I'm missing, that will generate a matrix of coordinate-vectors as described? 2) Is this odd ordering really what we expect from meshgrid? Is there a spec to this point that I can count on?

[EDIT] Following up on Jaime's solution, here's a more generalized function to build it a little more explicitly for anyone interested: [EDIT 2, fixed a bug, might be another, can't spend much more time on this right now, this really needs to be a more common function...]

def build_coords(*vecs):
    coords = numpy.empty(map(len,vecs)+[len(vecs)])
    for ii in xrange(len(vecs)):
        s = np.hstack((len(vecs[ii]), np.ones(len(vecs)-ii-1)))
        v = vecs[ii].reshape(s)
        coords[...,ii] = v
    return coords

回答1:

Try np.meshgrid([0, 1], [0, 1], [0, 1], indexing="ij"). The meshgrid docs are actually pretty explicit about how the default indexing="xy" produces a funny axis ordering as compared to the non-default indexing="ij", so you can check that for more details. (They're not as clear on why it works this way, alas...)



回答2:

The numpy function indices can also be used to this effect, its functionality being clear from its name as well.

>>> import numpy as np
>>> np.indices((2,3))
array([[[0, 0, 0],
        [1, 1, 1]],

       [[0, 1, 2],
        [0, 1, 2]]])

which can be thought of as a 2 by 3 matrix of y coordinates and a 2 by 3 matrix of x coordinates (y,x = np.indices((2,3))). It can be recast into the form proposed by Jaime by transposing the axes:

>>> np.indices((2,3)).transpose((1,2,0))

It is functionally equivalent to the meshgrid solution, using indexing='ij', but does not require you to provide the coordinate arrays, which can be a benefit when you have many dimensions.

>>> def f1(shape):
...     return np.array(np.meshgrid(*(np.arange(s) for s in shape), indexing='ij'))
...
>>> shape = (200, 31, 15, 4)
>>> np.all(f1(shape) == np.indices(shape))
True

Timing wise, these solutions are similar, when you take into account the time required for generating the 1-D arrays on which meshgrid operates, but meshgrid returns a list (of arrays), not an nd-array like indices. By adding an extra call to np.array as done in f1 above, indices has a clear advantage over meshgrid:

In [14]: %timeit f1(shape)
100 loops, best of 3: 14 ms per loop

In [15]: %timeit np.indices(shape)
100 loops, best of 3: 5.77 ms per loop

Without the extra call to array:

In [16]: def f2(shape):
    return np.meshgrid(*(np.arange(s) for s in shape), indexing='ij')
   .....: 

In [17]: %timeit f2(shape)
100 loops, best of 3: 5.78 ms per loop

Always be careful with interpreting timings though. This likely won't be the bottleneck in any problem you tackle.

In any case, meshgrid can do many more things than indices can, such as generating a more general rectilinear grid instead of a Cartesian grid, so use them when appropriate. In this case, I'd go with the naming-wise more descriptive indices.



回答3:

Given the 1D coords:

rows = np.arange(2)
cols = np.arange(3)

I was hoping that this would do the trick:

np.dstack((rows[:, None, None], cols[:, None]))

But apparently dstack and the like require exactly matching dimensions, they will not broadcast them, which I think is a shame.

So this alternative is a little long, but explicit is better than implicit, and you can always wrap it all into a little function:

>>> coords = np.empty((len(rows), len(cols), 2), dtype=np.intp)
>>> coords[..., 0] = rows[:, None]
>>> coords[..., 1] = cols

>>> coords
array([[[0, 0],
        [0, 1],
        [0, 2]],

       [[1, 0],
        [1, 1],
        [1, 2]]])


回答4:

One simple way that I use is the following-

x,y = np.mgrid[-10:10:0.1, -10:10:0.1]
pos = np.empty(x.shape + (2,))
pos[:, :, 0] = x; pos[:, :, 1] = y
pos = np.reshape(pos, (x.shape[0]*x.shape[1], 2))

The pos is the required coordinates array.