I'm trying to get the indices to sort a multidimensional array by the last axis, e.g.
>>> a = np.array([[3,1,2],[8,9,2]])
And I'd like indices i such that,
>>> a[i]
array([[1, 2, 3],
       [2, 8, 9]])
Based on the documentation of numpy.argsort I thought it should do this, but I'm getting the error:
>>> a[np.argsort(a)]
IndexError: index 2 is out of bounds for axis 0 with size 2
Edit: I need to rearrange other arrays of the same shape (e.g. an array b such that a.shape == b.shape) in the same way, so that
>>> b = np.array([[0,5,4],[3,9,1]])
>>> b[i]
array([[5, 4, 0],
       [1, 3, 9]])
I found the answer here, with someone having the same problem. The key is just cheating the indexing to work properly...
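The linked answer is not reproduced here, so the following is only a minimal sketch of that kind of indexing, assuming it pairs the argsort result with an explicit row index. The name rows is made up for illustration and does not come from the linked answer.

```python
import numpy as np

a = np.array([[3, 1, 2], [8, 9, 2]])
b = np.array([[0, 5, 4], [3, 9, 1]])

# Column order that sorts each row of a (argsort defaults to the last axis).
i = np.argsort(a)

# Hypothetical helper: row numbers as a (2, 1) column, so that it
# broadcasts against the (2, 3) array i.
rows = np.arange(a.shape[0])[:, None]

a_sorted = a[rows, i]  # each row of a in ascending order
b_sorted = b[rows, i]  # b rearranged with the same per-row permutation
```

Indexing with the pair (rows, i) picks one element per (row, column) position, which is why it avoids the IndexError that a[i] raises.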
You can also use linear indexing, which might perform better. With M, N = a.shape, the expression
a.argsort(1)+(np.arange(M)[:,None]*N)
gives the linear indices used to map b to the desired sorted output for b. The same linear indices can also be used on a to get the sorted output for a.
Sample run -
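The original sample run did not survive, so here is a small sketch of the linear-indexing idea on the arrays from the question. Using np.take on the flattened array is an assumption about how the indices get applied; the name lin_idx is illustrative only.

```python
import numpy as np

a = np.array([[3, 1, 2], [8, 9, 2]])
b = np.array([[0, 5, 4], [3, 9, 1]])

M, N = a.shape

# Per-row sort order plus a row offset of r * N gives positions
# in the flattened (row-major) array.
lin_idx = a.argsort(1) + (np.arange(M)[:, None] * N)

# np.take without an axis argument indexes the flattened input and
# returns a result with the same shape as lin_idx.
a_sorted = np.take(a, lin_idx)
b_sorted = np.take(b, lin_idx)
```

Whether this is actually faster than the two-array indexing depends on array size and NumPy version, so it is worth timing on your own data.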
Runtime tests -
Solution:
You got it right, though I wouldn't describe it as cheating the indexing.
Maybe this will help make it clearer:
i, the result of np.argsort(a), is the order that we want for each row. That is, row r of the result is a[r, i[r]]. To do both indexing steps at once, we have to use a 'column' index for the 1st dimension.
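A short sketch of those two steps, using the array from the question. The names row0, row1 and both are just for illustration.

```python
import numpy as np

a = np.array([[3, 1, 2], [8, 9, 2]])
i = np.argsort(a, axis=1)  # per-row column order

# One row at a time: apply each row's order to that row.
row0 = a[0, i[0]]
row1 = a[1, i[1]]

# Both rows at once: a (2, 1) column of row numbers broadcasts
# against the (2, 3) array i.
both = a[[[0], [1]], i]
```

Here both has the same rows as row0 and row1 stacked together, which is the sorted array the question asks for.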
Another array that could be paired with i is j, an array of the same shape as i that holds each element's row number, here [[0,0,0],[1,1,1]]. If i identifies the column for each element, then j specifies the row for each element. The [[0],[1]] column array works just as well because it can be broadcast against i. I think of [[0],[1]] as 'short hand' for j. Together they define the source row and column of each element of the new array. They work together, not sequentially. The full mapping from a to the new array is a[j, i], where element [r, c] of the result is a[j[r, c], i[r, c]].
The above answers are now a bit outdated, since new functionality was added in numpy 1.15 to make it simpler; take_along_axis (https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.take_along_axis.html) allows you to do:
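The snippet from that answer is missing here, so this is a minimal sketch of how np.take_along_axis can be applied to the arrays from the question (requires numpy 1.15 or later). The comparison against explicit a[j, i] indexing is added only as a sanity check.

```python
import numpy as np

a = np.array([[3, 1, 2], [8, 9, 2]])
b = np.array([[0, 5, 4], [3, 9, 1]])

# Sort order along the last axis.
i = np.argsort(a, axis=-1)

# Apply that order to a, and the same order to b.
a_sorted = np.take_along_axis(a, i, axis=-1)
b_sorted = np.take_along_axis(b, i, axis=-1)

# Sanity check: same result as explicit indexing with a full
# row-index array j paired with i.
j = np.array([[0, 0, 0], [1, 1, 1]])
same = np.array_equal(a_sorted, a[j, i])
```

This works for any number of dimensions without building the row index by hand, as long as i has the same number of dimensions as the array being indexed.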