I have the following array:
a = array([[1, 2, 3],
           [1, 2, 3],
           [1, 2, 3]])
I understand that np.random.shuffle(a.T) will shuffle the array along the row, but what I need is for it to shuffle each row independently. How can this be done in NumPy? Speed is critical, as there will be several million rows.
For this specific problem, each row will contain the same starting population.
Good answer above. But I will throw in a quick and dirty way:
Not very elegant, but you can get this job done with just one short line.
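As an illustration of what such a one-liner might look like (an assumption, not necessarily the exact line this answer used), each row can be permuted separately and the results stacked into a new array. The seeded `rng` is only there to make the sketch reproducible.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only for reproducibility of the sketch
a = np.array([[1, 2, 3],
              [1, 2, 3],
              [1, 2, 3]])

# The one-liner: permute every row on its own and rebuild the array.
# This returns a new array and leaves `a` untouched.
shuffled = np.array([rng.permutation(row) for row in a])
```

This loops in Python, so it is convenient rather than fast for millions of rows.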
Building on my comment to @Hun's answer, here's the fastest way to do this:
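A rough sketch of an in-place, rows-only shuffle follows. It is an assumption about the kind of code meant here, with no claim that it is the fastest option: iterating over a 2-D array yields row views, and shuffling a view modifies the parent array.

```python
import numpy as np

rng = np.random.default_rng(0)        # seeded only for reproducibility
a = np.tile(np.arange(1, 6), (4, 1))  # 4 rows, each [1, 2, 3, 4, 5]

# Each `row` is a view into `a`, so shuffling it changes `a` in place.
for row in a:
    rng.shuffle(row)
```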
This works in-place and can only shuffle rows. If you need more options:
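One loop-free possibility for the 2-D case, sketched here as an assumption rather than as the answer's original code, is to draw a random key per element, argsort the keys along axis 1 to get an independent permutation of column indices for every row, and then use fancy indexing.

```python
import numpy as np

rng = np.random.default_rng(0)        # seeded only for reproducibility
a = np.tile(np.arange(1, 6), (4, 1))  # 4 rows, each [1, 2, 3, 4, 5]

# Sorting independent random keys gives one random permutation per row.
idx = rng.random(a.shape).argsort(axis=1)

# Row indices of shape (4, 1) broadcast against idx of shape (4, 5).
rows = np.arange(a.shape[0])[:, None]
shuffled = a[rows, idx]
```

The explicit `rows` index array is what ties this form to exactly two dimensions.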
This, however, has the limitation of only working on 2D arrays. For higher-dimensional tensors, I would use:
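For arrays of any dimensionality, a sketch along the same lines can lean on `np.take_along_axis` (available since NumPy 1.15). The helper name `shuffle_along_axis` and its optional `rng` parameter are assumptions made for this illustration.

```python
import numpy as np

def shuffle_along_axis(a, axis, rng=None):
    """Return a copy of `a` with every 1-D slice along `axis` shuffled independently."""
    rng = np.random.default_rng() if rng is None else rng
    # Independent random keys; argsort along `axis` yields one permutation per slice.
    idx = rng.random(a.shape).argsort(axis=axis)
    return np.take_along_axis(a, idx, axis=axis)

t = np.arange(24).reshape(2, 3, 4)
out = shuffle_along_axis(t, axis=1, rng=np.random.default_rng(0))
```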
You can do it with NumPy without any loop or extra function, and much faster. E.g., suppose we have an array of shape (2, 6) and we want a (2, 2) sub-array with an independent random index for each column.
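A minimal sketch of that idea, assuming the goal is to pick, for each row independently, 2 of its 6 entries without replacement. The variable names and the sample data are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded only for reproducibility
a = np.arange(12).reshape(2, 6)  # shape (2, 6)
k = 2                            # number of entries to keep per row

# Argsort of random keys gives a random permutation of 0..5 per row;
# keeping the first k columns gives k distinct random indices per row.
idx = rng.random(a.shape).argsort(axis=1)[:, :k]  # shape (2, 2)
sub = np.take_along_axis(a, idx, axis=1)          # shape (2, 2)
```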
It works for any number of dimensions, and the same approach applies when scrambling along the 0-axis.
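A sketch of a function of this kind, with calls along the last axis and along the 0-axis. The name `scramble`, the `rng` parameter, and the use of random-key argsort with `np.take_along_axis` to get a separate permutation for every slice are assumptions of this illustration.

```python
import numpy as np

def scramble(a, axis=-1, rng=None):
    """Return a copy of `a` with values shuffled independently along `axis`."""
    rng = np.random.default_rng() if rng is None else rng
    b = a.swapaxes(axis, -1)                    # view with the target axis last
    idx = rng.random(b.shape).argsort(axis=-1)  # new index order for each 1-D slice
    b = np.take_along_axis(b, idx, axis=-1)     # shuffle along the last axis
    return b.swapaxes(axis, -1)                 # swap back to a's shape

a = np.arange(24).reshape(2, 3, 4)
last = scramble(a, axis=-1, rng=np.random.default_rng(0))
first = scramble(a, axis=0, rng=np.random.default_rng(1))
```

Each 1-D slice along the chosen axis keeps its own values but receives its own random order.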
This works by first swapping the target axis with the last axis:
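The swap itself can be checked in isolation. In this small sketch (array shape and variable names chosen only for illustration), the swapped array shares memory with the original, and swapping again restores the original shape.

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)
axis = 0

b = a.swapaxes(axis, -1)          # target axis moved to the end: shape (4, 3, 2)
shares = np.shares_memory(a, b)   # True when b is a view of a's data
restored = b.swapaxes(axis, -1)   # the same swap undoes itself
```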
This is a common trick used to standardize code that deals with one axis. It reduces the general case to the specific case of dealing with the last axis. Since swapaxes returns a view in NumPy version 1.10 or higher, there is no copying involved, so calling swapaxes is very quick.
Now we can generate a new index order for the last axis, shuffle b (independently along the last axis), and then reverse the swapaxes to return an a-shaped result.
If you don't want a return value and want to operate on the array directly, you can specify the indices to shuffle, as in np.random.shuffle(a[n]).
If you want a return value as well, you can use numpy.random.permutation, in which case replace np.random.shuffle(a[n]) with a[n] = np.random.permutation(a[n]).
Warning: do not do a[n] = np.random.shuffle(a[n]). shuffle does not return anything, so the row/column you end up "shuffling" will be filled with nan instead.
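The difference between the two calls can be seen in a short sketch. The array contents and the row index `n` are arbitrary choices for the example.

```python
import numpy as np

a = np.arange(12, dtype=float).reshape(3, 4)
n = 1  # index of the row to shuffle

# In place: a[n] is a view of row n, so shuffling it modifies `a` directly.
np.random.shuffle(a[n])

# With a return value: permutation returns a shuffled copy to assign back.
a[n] = np.random.permutation(a[n])

# shuffle always returns None, which is why its result must not be assigned.
result = np.random.shuffle(a[n])
```

Only row `n` is reordered; the other rows keep their original order.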