numpy.random.choice vs random.choice

Posted 2020-03-19 06:20

Question:

Why does numpy.random.choice not work the same way as random.choice? When I do this:

 >>> random.choice([(1,2),(4,3)])
 (1, 2)

It works.

But when I do this:

 >>> np.random.choice([(1,2), (3,4)])
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "mtrand.pyx", line 1393, in mtrand.RandomState.choice (numpy/random/mtrand/mtrand.c:15450)
 ValueError: a must be 1-dimensional

How do I achieve the same behavior as random.choice() in numpy.random.choice()?

Answer 1:

As noted in the docs, np.random.choice expects a 1D array, and your input, when converted to an array, would be 2D. So it won't work directly like that.
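To see why the input counts as 2D, here is a minimal check, assuming only that NumPy is imported as np. The name arr is introduced just for this illustration.

```python
import numpy as np

a = [(1, 2), (3, 4)]

# A list of equal-length tuples is converted to a 2D array,
# with one row per tuple and one column per tuple element.
arr = np.asarray(a)

print(arr.shape)  # (2, 2)
print(arr.ndim)   # 2
```

Since the converted array has two dimensions, np.random.choice rejects it with the "a must be 1-dimensional" error.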

To make it work, we can pass in the length of the input and let it select one index. Indexing into the input with that index gives the equivalent of random.choice, as shown below -

out = a[np.random.choice(len(a))]  # a is the input sequence

Sample run -

In [74]: a = [(1,2),(4,3),(6,9)]

In [75]: a[np.random.choice(len(a))]
Out[75]: (6, 9)

In [76]: a[np.random.choice(len(a))]
Out[76]: (1, 2)
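The same index trick extends to several draws at once. The following is only a sketch: the names idx and samples are illustrative, and it relies on the size argument of np.random.choice, which samples with replacement by default.

```python
import numpy as np

a = [(1, 2), (4, 3), (6, 9)]

# Draw five indices in [0, len(a)) in a single call
# (sampling is with replacement by default).
idx = np.random.choice(len(a), size=5)

# Index back into the original list, so each result is still a plain tuple.
samples = [a[i] for i in idx]
print(samples)
```

Because the indexing happens on the original list, the elements are never converted to NumPy types.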

Alternatively, we can convert the input to a 1D array of object dtype, which lets us use np.random.choice directly, as shown below -

In [131]: a0 = np.empty(len(a),dtype=object)

In [132]: a0[:] = a

In [133]: a0.shape
Out[133]: (3,)  # 1D array

In [134]: np.random.choice(a0)
Out[134]: (6, 9)

In [135]: np.random.choice(a0)
Out[135]: (4, 3)
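If this conversion is needed in more than one place, it could be wrapped in a small helper. The function name to_object_array below is hypothetical (it is not part of NumPy), and it fills the array element by element, a more explicit variant of the slice assignment used above.

```python
import numpy as np

def to_object_array(seq):
    """Pack the items of seq into a 1D object array (hypothetical helper)."""
    out = np.empty(len(seq), dtype=object)
    # Assign one item at a time so each tuple is stored as a single object
    # rather than being unpacked into a second dimension.
    for i, item in enumerate(seq):
        out[i] = item
    return out

a = [(1, 2), (4, 3), (6, 9)]
a0 = to_object_array(a)
print(a0.shape)  # (3,)

picked = np.random.choice(a0)
print(picked)
```

The result is a 1D array whose elements are the original tuples, so np.random.choice accepts it.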


Answer 2:

Relatedly, if you want to randomly sample rows of a 2D matrix like this:

x = np.array([[1, 100], [2, 200], [3, 300], [4, 400]])

then you can do something like this:

n_rows = x.shape[0]
x[np.random.choice(n_rows, size=n_rows, replace=True), :]

This should work for a 2D matrix with any number of columns, and you can of course draw as many samples as you want with the size kwarg.
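As a self-contained sketch of the row-sampling idea, the snippet below draws a same-size sample with replacement and a smaller sample without replacement. The names boot and sub are illustrative only.

```python
import numpy as np

x = np.array([[1, 100], [2, 200], [3, 300], [4, 400]])
n_rows = x.shape[0]

# Same-size sample with replacement: rows may repeat.
boot = x[np.random.choice(n_rows, size=n_rows, replace=True), :]

# Two distinct rows: replace=False prevents the same index being drawn twice.
sub = x[np.random.choice(n_rows, size=2, replace=False), :]

print(boot)
print(sub)
```

Only the row indices are sampled, so each selected row keeps all of its columns together.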