C++. Visual Studio 2010.
I have a std::vector V of N unique elements (heavy structs). How can I efficiently pick M random, unique elements from it?
E.g. V contains 10 elements: { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 } and I pick three...
- 4, 0, 9
- 0, 7, 8
- But NOT this: 0, 5, 5 <--- not unique!
STL is preferred. So, something like this?
```cpp
std::minstd_rand gen; // linear congruential engine??
std::uniform_int<int> unif(0, v.size() - 1);
gen.seed((unsigned int)time(NULL));
// ...?
// Or is there a good solution using std::random_shuffle for heavy objects?
```
Since you wanted it to be efficient, I think you can get amortised O(M), assuming you have to perform that operation a lot of times. However, this approach is not reentrant.

First of all, create a local (i.e. static) vector of std::vector<...>::size_type (i.e. unsigned will do) values. When you enter your function, resize the vector to match N and fill it with values from the old size to N-1.
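Taken together with the picking and repair steps described below, the whole procedure might be sketched like this (the function name `pick_unique` and the use of `std::rand` are my own assumptions, not the original code):

```cpp
#include <algorithm> // std::swap
#include <cstdlib>   // std::rand
#include <vector>

// Sketch of the amortised-O(M) approach: a static index table that is
// grown on demand, partially shuffled, and repaired after each call.
std::vector<std::size_t> pick_unique(std::size_t N, std::size_t M)
{
    static std::vector<std::size_t> indices; // survives across calls

    // Fill step: append only the values the table is still missing.
    for (std::size_t i = indices.size(); i < N; ++i)
        indices.push_back(i);

    // Picking step: a partial Fisher-Yates shuffle of the first M slots.
    std::vector<std::size_t> result;
    result.reserve(M);
    for (std::size_t i = 0; i < M; ++i)
    {
        std::size_t j = i + std::rand() % (N - i); // j in [i, N)
        std::swap(indices[i], indices[j]);
        result.push_back(indices[i]);
    }

    // Repair step: every slot disturbed by the partial shuffle is either
    // one of the first M positions or the home slot of a picked value,
    // so restoring just those makes indices monotonic again in O(M).
    for (std::size_t i = 0; i < M; ++i)
        indices[i] = i;
    for (std::size_t i = 0; i < M; ++i)
        indices[result[i]] = result[i];

    return result;
}
```

The values in `result` then index into your original vector, so the heavy structs themselves are never copied or shuffled.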
Then, randomly pick M unique numbers from that vector. Now your result is sitting in the result vector. However, you still have to repair your changes to indices for the next run, so that indices is monotonic again.

But this approach is only useful if you have to run that algorithm a lot, and N doesn't grow so big that you cannot live with indices eating up RAM all the time.

Create a random permutation of the range
0, 1, ..., N - 1
and pick the first M of them; use those as indices into your original vector.

A random permutation is easily made with the standard library by using std::iota together with std::random_shuffle.
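A sketch of that idea (shown here with std::shuffle, the modern replacement for std::random_shuffle, which was removed in C++17; on VS2010 you would call std::random_shuffle(perm.begin(), perm.end()) instead — the helper name `random_indices` is my own):

```cpp
#include <algorithm> // std::shuffle
#include <numeric>   // std::iota
#include <random>
#include <vector>

// Build the permutation 0, 1, ..., N-1, shuffle it, and keep the
// first M entries as the randomly chosen indices.
std::vector<std::size_t> random_indices(std::size_t N, std::size_t M,
                                        std::mt19937& gen)
{
    std::vector<std::size_t> perm(N);
    std::iota(perm.begin(), perm.end(), std::size_t(0)); // 0, 1, ..., N-1
    std::shuffle(perm.begin(), perm.end(), gen);         // random permutation
    perm.resize(M);                                      // only the first M
    return perm;
}
```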
You can supply random_shuffle with a random number generator of your choice; check the documentation for details.

Most of the time, the method provided by Kerrek is sufficient. But if N is very large and M is orders of magnitude smaller, the following method may be preferred.
Create a set of unsigned integers, and add random numbers in the range [0, N-1] to it until the size of the set is M. Then use the elements at those indices.
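A sketch of this rejection-sampling idea, assuming plain std::set and std::rand (the helper name `pick_indices` is mine; note that iterating the set yields the chosen indices in sorted order, and that rand() % N carries a slight modulo bias):

```cpp
#include <cstdlib> // std::rand
#include <set>
#include <vector>

// Keep inserting random candidates until M distinct indices have been
// collected; std::set silently ignores duplicates. Efficient when M << N,
// since collisions are then rare.
std::vector<std::size_t> pick_indices(std::size_t N, std::size_t M)
{
    std::set<std::size_t> chosen;
    while (chosen.size() < M)
        chosen.insert(std::rand() % N); // duplicate inserts are no-ops
    return std::vector<std::size_t>(chosen.begin(), chosen.end());
}
```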