Choose m elements randomly from a vector containin

2019-03-14 20:23发布

问题:

I have a vector containing n elements. I need to choose a subset of m elements randomly from the vector without repetition. What is the most efficient way of doing this? I need to do this several thousands of times in my code.

The solution on top of my mind is to use rand() to generate a random number k between 0 and n. Then pick the kth element in the vector and insert it into a std::set. Keep doing this till the set's size becomes equal to m. I am now assured that the set contains m unique elements randomly chosen from the set of n elements.

What are the other possible solutions?

Thanks.

回答1:

You want a Fisher-Yates shuffle (stop after M iterations):

template<class BidiIter >
BidiIter random_unique(BidiIter begin, BidiIter end, size_t num_random) {
    size_t left = std::distance(begin, end);
    while (num_random--) {
        BidiIter r = begin;
        std::advance(r, rand()%left);
        std::swap(*begin, *r);
        ++begin;
        --left;
    }
    return begin;
}

Demo at http://ideone.com/3A3cv. This is significantly faster than std::random_shuffle when you only need a few random numbers out of the set, and should be just about the same speed even if N==M.



回答2:

One way you could do this is to create a list of all the indices of the vector, shuffle them, and take the first n to be the indices of the selected objects:

struct rangegenerator {
    rangegenerator(int init) : start(init) { }

    int operator()() {
        return start++;
    }

    int start;
};

vector<T> numbers; // this is filled somewhere else

vector<int> indices(numbers.size());

generate(begin(indices), end(indices), rangegenerator(0));

random_shuffle(begin(indices), end(indices));

// then take the first n elements of indices and use them as indices into numbers