- I have a numpy matrix with shape (4601, 58).
- I want to split the matrix randomly by rows into 60%, 20%, and 20% parts.
- This is for a machine learning task.
- Is there a numpy function that randomly selects rows?
If you want to randomly select rows, you could just use `random.sample` from the standard Python library to draw `k` row indices into a list, here called `choice`. `random.sample` samples without replacement, so you don't need to worry about repeated rows ending up in `choice`. Given a numpy array called `matrix`, you can select the rows by indexing, like this: `matrix[choice]`.
Of course, `k` can be equal to the total number of elements in the population, and then `choice` would contain a random ordering of the indices for your rows. Then you can partition `choice` as you please, if that's all you need.
You can use `numpy.random.shuffle`, which shuffles the rows of an array in place.
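Both ideas above can be sketched in a few lines. This is only an illustration: the stand-in data, the cut points computed with `int(...)`, and the names `train`, `valid`, and `test` are assumptions, not part of the original answers.

```python
import random

import numpy as np

# Stand-in data with the shape from the question (values are arbitrary).
matrix = np.arange(4601 * 58, dtype=float).reshape(4601, 58)
n_rows = matrix.shape[0]

# Cut points for a 60% / 20% / 20% split by row count (rounded down).
first_cut = int(0.6 * n_rows)
second_cut = int(0.8 * n_rows)

# Approach 1: random.sample with k == n_rows yields a random ordering
# of all row indices, which can then be partitioned.
choice = random.sample(range(n_rows), n_rows)
train = matrix[choice[:first_cut]]
valid = matrix[choice[first_cut:second_cut]]
test = matrix[choice[second_cut:]]

# Approach 2: shuffle a copy of the rows in place, then cut it into
# three consecutive blocks at the same positions.
shuffled = matrix.copy()
np.random.shuffle(shuffled)
train2, valid2, test2 = np.split(shuffled, [first_cut, second_cut])
```

Because the cut points are rounded down, the last part absorbs any leftover rows. Shuffling a copy leaves `matrix` itself untouched; shuffle the original directly if the extra memory matters.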
Since you need it for machine learning, here is a method I wrote:
A complement to HYRY's answer, if you want to consistently shuffle several arrays `x`, `y`, `z` that share the same first dimension (`x.shape[0] == y.shape[0] == z.shape[0] == n_samples`). You can do:
And then proceed with the split of each shuffled array as in HYRY's answer.
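One way to shuffle several arrays consistently with plain numpy is to draw a single permutation of the row indices and apply it to every array. The sketch below is not necessarily what this answer originally showed; the array contents, `n_samples = 10`, and the seed are made up for illustration.

```python
import numpy as np

# Hypothetical arrays sharing the same first dimension.
n_samples = 10
x = np.arange(n_samples * 3).reshape(n_samples, 3)  # row i starts with 3 * i
y = np.arange(n_samples)                            # y[i] == i
z = np.arange(n_samples) * 10.0                     # z[i] == 10 * i

# Draw one permutation of the row indices (the seed is arbitrary).
rng = np.random.default_rng(0)
perm = rng.permutation(n_samples)

# Indexing each array with the same permutation keeps rows aligned.
x_shuffled, y_shuffled, z_shuffled = x[perm], y[perm], z[perm]
```

After this, row `i` of each shuffled array still refers to the same original sample, so the three arrays can be cut at the same positions.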