I have a vector containing n
elements. I need to choose a subset of m
elements randomly from the vector without repetition. What is the most efficient way of doing this? I need to do this several thousands of times in my code.
The solution on top of my mind is to use rand()
to generate a random number k
between 0
and n
. Then pick the k
th element in the vector and insert it into a std::set
. Keep doing this till the set's size becomes equal to m
. I am now assured that the set contains m
unique elements randomly chosen from the set of n
elements.
What are the other possible solutions?
Thanks.
You want a Fisher-Yates shuffle (stop after M iterations):
Demo at http://ideone.com/3A3cv. This is significantly faster than
std::random_shuffle
when you only need a few random numbers out of the set, and should be just about the same speed even ifN==M
.One way you could do this is to create a list of all the indices of the vector, shuffle them, and take the first
n
to be the indices of the selected objects: