How to get the index of numpy.random.choice?

Posted 2020-05-20 09:20

Is it possible to modify the numpy.random.choice function so that it returns the index of the chosen element? Basically, I want to create a list and select elements randomly without replacement:

>>> import numpy as np
>>> a = [1,4,1,3,3,2,1,4]
>>> np.random.choice(a)
4
>>> a
[1, 4, 1, 3, 3, 2, 1, 4]

a.remove(np.random.choice(a)) removes the first element of the list that has the chosen value (a[1] in the example above), which may not be the element that was actually chosen (e.g., a[7]).

8 answers
forever°为你锁心
#2 · 2020-05-20 09:37

Here's one way to find out the index of a randomly selected element:

import random # plain random module, not numpy's
random.choice(list(enumerate(a)))[0]
=> 4      # just an example, index is 4

Or you could retrieve the element and the index in a single step:

random.choice(list(enumerate(a)))
=> (1, 4) # just an example, index is 1 and element is 4
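
To remove exactly the element that was drawn, one option is to draw a position rather than a value and delete by position. A minimal sketch using the standard library; the variable names are illustrative, and random.randrange is swapped in here for random.choice(list(enumerate(a))) only to avoid building the intermediate list:

```python
import random

a = [1, 4, 1, 3, 3, 2, 1, 4]
original = list(a)               # copy kept only to check the result

idx = random.randrange(len(a))   # uniform random position in a
chosen = a.pop(idx)              # removes the element at that exact position

print(idx, chosen, a)
```

Because the deletion is by position, duplicates elsewhere in the list are left untouched.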
太酷不给撩
#3 · 2020-05-20 09:38
numpy.random.choice(a, size=however_many, replace=False)

If you want a sample without replacement, just ask numpy to make you one. Don't loop and draw items repeatedly. That'll produce bloated code and horrible performance.

Example:

>>> import numpy
>>> a = numpy.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> numpy.random.choice(a, size=5, replace=False)
array([7, 5, 8, 6, 2])

On a sufficiently recent NumPy (at least 1.17), you should use the new randomness API, which fixes a longstanding performance issue where the old API's replace=False code path unnecessarily generated a complete permutation of the input under the hood:

rng = numpy.random.default_rng()
result = rng.choice(a, size=however_many, replace=False)
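
If the positions themselves are needed, the same API can sample indices instead of values. A sketch, assuming NumPy 1.17 or later; the seed, the sample size, and the variable names are arbitrary choices for illustration:

```python
import numpy as np

a = np.array([1, 4, 1, 3, 3, 2, 1, 4])
rng = np.random.default_rng(0)   # seed is arbitrary, only for repeatability

# Passing an int n to choice samples from np.arange(n), i.e. from the indices.
idx = rng.choice(a.size, size=3, replace=False)
values = a[idx]                  # the elements at the sampled positions

print(idx, values)
```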
手持菜刀,她持情操
#4 · 2020-05-20 09:39

Regarding your first question, you can work the other way around: randomly choose from the indices of the array a, then fetch the value.

>>> import numpy as np
>>> a = np.array([1,4,1,3,3,2,1,4])
>>> np.random.choice(np.arange(a.size))
6          # just an example, the index will vary
>>> a[6]
1

But if you just need a random sample without replacement, replace=False will do. I can't remember when it was first added to random.choice; it might have been 1.7.0, so it may not work if you are running a very old NumPy. Keep in mind that the default is replace=True.

萌系小妹纸
#5 · 2020-05-20 09:43

Based on your comment:

The sample is already a. I want to work directly with a so that I can control how many elements are still left and perform other operations with a. – HappyPy

it sounds to me like you're interested in working with a after n randomly selected elements are removed. Instead, why not work with N = len(a) - n randomly selected elements from a? Since you want them to still be in the original order, you can select from indices like in @CTZhu's answer, but then sort them and grab from the original list:

import numpy as np
n = 3 #number to 'remove'
a = np.array([1,4,1,3,3,2,1,4])
i = np.random.choice(np.arange(a.size), a.size-n, replace=False)
i.sort()
a[i]
#array([1, 4, 1, 3, 1])

So now you can save that as a again:

a = a[i]

and work with a with n elements removed.
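
An equivalent way to express this, sketched here with a boolean mask instead of sorting the kept indices, is to sample the n positions to drop and mask them out; the order of the survivors is preserved automatically. The names drop, keep, removed and remaining are made up for this example:

```python
import numpy as np

n = 3                                   # number of elements to remove
a = np.array([1, 4, 1, 3, 3, 2, 1, 4])

drop = np.random.choice(a.size, size=n, replace=False)  # positions to remove
keep = np.ones(a.size, dtype=bool)
keep[drop] = False                      # mark the sampled positions as removed

removed = a[drop]                       # the elements that were taken out
remaining = a[keep]                     # the rest, still in original order

print(removed, remaining)
```

This keeps both the removed and the remaining elements available, which may help if other operations on a depend on what was taken out.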

Summer. ? 凉城
#6 · 2020-05-20 09:43

This may be late, but it is worth mentioning because I think the simplest way to do this is:

import numpy as np
a = [1,4,1,3,3,2,1,4]
n = len(a)
idx = np.random.choice(list(range(n)), p=np.ones(n)/n)

This chooses uniformly from the indices. More generally, you can do weighted sampling (and get the index back) like this:

probs = [.3, .4, .2, 0, .1]
n = len(probs)  # p must have one entry per index
idx = np.random.choice(list(range(n)), p=probs)

If you repeat this many times (e.g. 1e5), the normalized histogram of the chosen indices comes out like [0.30126 0.39817 0.19986 0. 0.10071] in this case, which matches probs.

In any case, you should choose from the indices and, if you need to, use the values as their probabilities.
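
The frequency check described above can be reproduced roughly as follows. This is a sketch: the number of draws and the use of np.bincount are choices made here, and the exact frequencies will differ from run to run.

```python
import numpy as np

probs = [.3, .4, .2, 0, .1]
n = len(probs)                 # one probability per index
draws = 100_000

idx = np.random.choice(n, size=draws, p=probs)    # weighted index sampling
freq = np.bincount(idx, minlength=n) / draws      # observed share of each index

print(freq)                    # should be close to probs
```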

对你真心纯属浪费
#7 · 2020-05-20 09:46

This is a bit out of left field compared with the other answers, but I thought it might help with what it sounds like you're trying to do in a slightly larger sense. You can generate a random sample without replacement by shuffling the indices of the elements in the source array:

import numpy as np
source = np.random.randint(0, 100, size=100) # generate a set to sample from
idx = np.arange(len(source))
np.random.shuffle(idx)
subsample = source[idx[:10]]

This will create a sample (here, of size 10) by drawing elements from the source set (here, of size 100) without replacement.

You can interact with the non-selected elements by using the remaining index values, i.e.:

notsampled = source[idx[10:]]
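
On NumPy 1.17 or later, the same idea can be written with a Generator. A sketch, where rng.permutation replaces the in-place shuffle and the variable names follow the snippet above:

```python
import numpy as np

rng = np.random.default_rng()
source = rng.integers(0, 100, size=100)   # values to sample from
idx = rng.permutation(len(source))        # shuffled copy of 0..len(source)-1

subsample = source[idx[:10]]              # 10 elements drawn without replacement
notsampled = source[idx[10:]]             # the 90 elements that were not drawn

print(subsample)
```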