Is it possible to modify the numpy.random.choice function so that it returns the index of the chosen element? Basically, I want to create a list and select elements randomly without replacement.
>>> import numpy as np
>>> a = [1,4,1,3,3,2,1,4]
>>> np.random.choice(a)
4
>>> a
[1,4,1,3,3,2,1,4]
a.remove(np.random.choice(a)) will remove the first element of the list with that value it encounters (a[1] in the example above), which may not be the chosen element (e.g., a[7]).
Here's one way to find out the index of a randomly selected element:
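A minimal sketch of one way to do this, assuming the standard-library random module is acceptable here (the names index and chosen are illustrative only). It draws a position instead of a value, so duplicate values are no longer ambiguous:

```python
import random

a = [1, 4, 1, 3, 3, 2, 1, 4]

# Draw a position rather than a value, so duplicates are unambiguous.
index = random.randrange(len(a))
chosen = a[index]

# Removing by position deletes exactly the element that was drawn.
del a[index]
```

Deleting by index, rather than calling a.remove(chosen), is what guarantees the drawn element itself is the one removed.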
Or you could retrieve the element and the index in a single step:
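One possible sketch of the single-step variant, assuming that pairing each element with its position via enumerate is acceptable (the names index and value are illustrative):

```python
import random

a = [1, 4, 1, 3, 3, 2, 1, 4]

# enumerate() pairs each element with its position; choice() picks one pair.
index, value = random.choice(list(enumerate(a)))
```

Here value is always a[index], so the pair identifies the chosen element uniquely even when the list contains duplicates.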
If you want a sample without replacement, just ask numpy to make you one. Don't loop and draw items repeatedly. That'll produce bloated code and horrible performance.
Example:
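A minimal sketch with the legacy np.random.choice function, assuming a sample size of 3 purely for illustration:

```python
import numpy as np

a = [1, 4, 1, 3, 3, 2, 1, 4]

# One call draws 3 elements from distinct positions of a (no position reused).
sample = np.random.choice(a, size=3, replace=False)
```

Note that the sampled values can still repeat when a itself contains duplicates; replace=False only guarantees that no position is drawn twice.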
On a sufficiently recent NumPy (at least 1.17), you should use the new randomness API, which fixes a longstanding performance issue where the old API's replace=False code path unnecessarily generated a complete permutation of the input under the hood.
Regarding your first question, you can work the other way around: randomly choose from the indices of the array a and then fetch the value.
But if you just need a random sample without replacement, replace=False will do. I can't remember when it was first added to random.choice; it might have been 1.7.0. So if you are running a very old numpy, it may not work. Keep in mind the default is replace=True.
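A short sketch of choosing from the indices and then fetching the values, written here with the newer Generator API (np.random.default_rng, NumPy 1.17 or later). The sample size of 3 and the variable names are assumptions made for illustration:

```python
import numpy as np

a = np.array([1, 4, 1, 3, 3, 2, 1, 4])
rng = np.random.default_rng()  # Generator API, available since NumPy 1.17

# Sample positions without replacement, then look up the values.
indices = rng.choice(len(a), size=3, replace=False)
values = a[indices]
```

On an older NumPy, np.random.choice(len(a), size=3, replace=False) should give the same kind of result through the legacy API.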
Based on your comment, it sounds to me like you're interested in working with a after n randomly selected elements are removed. Instead, why not work with N = len(a) - n randomly selected elements from a? Since you want them to still be in the original order, you can select from indices like in @CTZhu's answer, but then sort them and use them to grab from the original list. You can then save the result as a again and work with a with n elements removed.
Maybe late, but it's worth mentioning this solution because I think the simplest way to do so is:
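A minimal sketch of what this could look like, assuming the idea is to pass the length of the list to np.random.choice so that it samples from the indices:

```python
import numpy as np

a = [1, 4, 1, 3, 3, 2, 1, 4]

# Passing an integer n makes choice() sample from range(n), i.e. the indices.
index = np.random.choice(len(a))
value = a[index]
```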
It means you are choosing from the indices uniformly. In a more general case, you can do a weighted sampling (and return the index) in this way:
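A sketch of weighted index sampling using the p parameter of np.random.choice. The weights below are hypothetical values chosen for illustration; they must be non-negative and sum to 1:

```python
import numpy as np

# Hypothetical weights; they must be non-negative and sum to 1.
probabilities = [0.3, 0.4, 0.2, 0.0, 0.1]

# Returns an index i in range(len(probabilities)) with probability p[i].
index = np.random.choice(len(probabilities), p=probabilities)

# Repeating the draw many times should approximate the weights.
draws = np.random.choice(len(probabilities), size=100_000, p=probabilities)
frequencies = np.bincount(draws, minlength=len(probabilities)) / draws.size
```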
If you repeat this many times (e.g. 1e5), the histogram of the chosen indices would look like
[0.30126 0.39817 0.19986 0. 0.10071]
in this case, which is correct.
Anyway, you should choose from the indices and use the values (if you need to) as their probabilities.
This is a bit in left field compared with the other answers, but I thought it might help what it sounds like you're trying to do in a slightly larger sense. You can generate a random sample without replacement by shuffling the indices of the elements in the source array:
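A sketch of the shuffle-the-indices idea, assuming a hypothetical source array of 100 elements and a sample size of 10:

```python
import numpy as np

source = np.arange(100) * 2          # hypothetical source array of size 100
indices = np.arange(len(source))
np.random.shuffle(indices)           # in-place shuffle of the index array

sample = source[indices[:10]]        # first 10 shuffled indices pick the sample
```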
This will create a sample (here, of size 10) by drawing elements from the source set (here, of size 100) without replacement.
You can interact with the non-selected elements by using the remaining index values, i.e.:
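A self-contained sketch of this, under the same assumptions (a hypothetical source array of size 100 and a sample of size 10); np.random.permutation is used here as a shorthand for building and shuffling the index array:

```python
import numpy as np

source = np.arange(100) * 2
indices = np.random.permutation(len(source))  # shuffled copy of 0..99

sample = source[indices[:10]]        # the selected elements
remaining = source[indices[10:]]     # everything that was not selected
```

Because every index appears exactly once in the permutation, sample and remaining never share a position of the source array.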