NumPy provides a way to get the index of the maximum value of an array via np.argmax.
I would like a similar thing, but returning the indexes of the N maximum values.
For instance, if I have an array, [1, 3, 2, 4, 5], function(array, n=3) would return [4, 3, 1].
bottleneck has a partial sort function, if the expense of sorting the entire array just to get the N largest values is too great. I know nothing about this module; I just googled numpy partial sort.

I found it most intuitive to use np.unique. The idea is that np.unique can return, alongside the sorted unique values, the index of each input value within them (return_inverse=True). From the largest unique values and those indices, the positions of the original values can be recreated.
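A minimal sketch of the np.unique idea, assuming the array from the question. The helper name top_n_via_unique is made up for illustration, and this is only one way to turn the inverse indices back into positions:

```python
import numpy as np

def top_n_via_unique(arr, n):
    """Positions of the n largest distinct values, largest first (hypothetical helper)."""
    arr = np.asarray(arr)
    # uniq holds the sorted unique values; inverse[i] is where arr[i] sits in uniq
    uniq, inverse = np.unique(arr, return_inverse=True)
    n = min(n, len(uniq))
    # ranks (positions in uniq) of the n largest unique values, largest first
    top_ranks = np.arange(len(uniq) - 1, len(uniq) - 1 - n, -1)
    # map each rank back to every position in arr holding that value
    positions = [np.flatnonzero(inverse == r) for r in top_ranks]
    return np.concatenate(positions)

print(top_n_via_unique([1, 3, 2, 4, 5], 3))  # [4 3 1]
```

Note that with repeated values, every position holding one of the n largest distinct values is returned, so the result can be longer than n.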
np.argpartition performs only a partial sort: it guarantees that the indices of the k largest values end up in the last k positions, and it is faster than np.argsort (which performs a full sort) when the array is quite large. But the returned indices are NOT in ascending/descending order. If you want the top k indices in strict ascending or descending order, np.argpartition alone won't return what you want.

Apart from doing a sort manually after np.argpartition, my solution is to use torch.topk from PyTorch, a tool for neural network construction that provides NumPy-like APIs with both CPU and GPU support. It's as fast as NumPy with MKL, and offers a GPU boost if you need large matrix/vector calculations. torch.topk returns the top k indices sorted by value (descending for the largest values).
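For comparison, the manual alternative mentioned above (sorting after np.argpartition) can be sketched in NumPy alone. The helper name top_k_sorted is hypothetical, and it assumes 1 <= k <= len(a):

```python
import numpy as np

def top_k_sorted(a, k):
    """Indices of the k largest values of a 1D array, largest first (hypothetical helper)."""
    a = np.asarray(a)
    # partial sort: the last k slots hold the indices of the k largest values, unordered
    idx = np.argpartition(a, -k)[-k:]
    # sort only those k candidates by value, then reverse for descending order
    return idx[np.argsort(a[idx])[::-1]]

a = np.array([1, 3, 2, 4, 5])
print(top_k_sorted(a, 3))  # [4 3 1]
```

Only the k selected elements are fully sorted, so the extra cost over np.argpartition stays small when k is much smaller than the array.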
Note that torch.topk accepts a torch tensor, and returns both the top k values and the top k indices as torch.Tensor. Similar to NumPy, torch.topk also accepts a dim argument (the counterpart of NumPy's axis) so that you can handle multi-dimensional arrays/tensors.

The following is a very easy way to see the maximum elements and their positions. Here axis is the dimension along which the maximum is taken: axis = 0 gives the maximum of each column and axis = 1 gives the maximum of each row for the 2D case. For higher dimensions, it depends on you.

Use:
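As a small check of the axis convention described above, here is a sketch on a 2D array whose values are made up for illustration:

```python
import numpy as np

M = np.array([[1, 9, 3],
              [7, 2, 8]])

# axis=0: one result per column
col_max = M.max(axis=0)      # [7, 9, 8]
col_arg = M.argmax(axis=0)   # [1, 0, 1] (row index of each column's maximum)

# axis=1: one result per row
row_max = M.max(axis=1)      # [9, 8]
row_arg = M.argmax(axis=1)   # [1, 2] (column index of each row's maximum)

print(col_max, col_arg)
print(row_max, row_arg)
```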
It also works with 2D arrays. For example,
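One possible sketch for the 2D case, assuming you want (row, column) positions of the k largest values. It flattens the array and maps flat indices back with np.unravel_index; the helper name top_k_positions is hypothetical, and this is not necessarily the approach this answer used:

```python
import numpy as np

def top_k_positions(a, k):
    """Positions of the k largest values of an array, largest first (hypothetical helper)."""
    a = np.asarray(a)
    flat = a.ravel()
    # indices of the k largest values in the flattened array, unordered
    idx = np.argpartition(flat, -k)[-k:]
    # order those k indices by value, largest first
    idx = idx[np.argsort(flat[idx])[::-1]]
    # convert flat indices back to one coordinate array per dimension
    coords = np.unravel_index(idx, a.shape)
    return list(zip(*(c.tolist() for c in coords)))

M = np.array([[1, 9, 3],
              [7, 2, 8]])
print(top_k_positions(M, 3))  # [(0, 1), (1, 2), (1, 0)]
```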
Use:
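A short sketch with heapq.nlargest, assuming the array from the question. The indices are the items being ranked, and a.take looks up the value for each index:

```python
import heapq
import numpy as np

a = np.array([1, 3, 2, 4, 5])

# rank the indices 0..len(a)-1 by the value stored at each index
top3 = heapq.nlargest(3, range(len(a)), key=a.take)
print(top3)  # [4, 3, 1]
```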
For regular Python lists:
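The same idea can be sketched for a plain list, using the list's own __getitem__ as the key function:

```python
import heapq

a = [1, 3, 2, 4, 5]

# rank the indices by the list element found at each index
top3 = heapq.nlargest(3, range(len(a)), key=a.__getitem__)
print(top3)  # [4, 3, 1]
```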
If you use Python 2, use xrange instead of range.

Source: heapq — Heap queue algorithm