Randomly generate a set of m integers from an arra

2020-03-21 08:10发布

问题:

Full Question is:

Write a method to randomly generate a set of m integers from an array of size n. Each element must have equal probability of being chosen`

This question is picked from "Crack the coding interview" and the solution is this:

We can swap the element with an element at the beginning of the array and then “remember” that the array now only includes elements j and greater. That is, when we pick subset[0] to be array[k], we replace array[k] with the first element in the array. When we pick subset[1], we consider array[0] to be “dead” and we pick a random element y between 1 and array size(). We then set subset[1] equal to array[y], and set array[y] equal to array[1]. Elements 0 and 1 are now “dead”. Subset[2] is now chosen from array[2] through array[array size()], and so on.

My question is that if we're shrinking the array from which we're picking random numbers then probability of each number being picked 1/remaining_num_elements. How does it stay equal for all the elements?

回答1:

Think about it like you're picking m random numbers from a bag of n numbers, with the first j elements representing the numbers in your hand and the remaining elements representing the numbers still in the bag. (You iterate j from 0 to m - 1 to pull out numbers, as your book suggests. j, then, represents the number of integers that you have already pulled out of the bag.)

If you're picking m integers from a bag in real life, then every time you pick a new number, you take from the bag only and not from your hand. Thus, the remaining_num_elements shrinks at each step.

When you think this way, it should be simple to see that this method guarantees that each element has equal probability of being chosen.



回答2:

if we're shrinking the array from which we're picking random numbers then probability of each number being picked 1/remaining_num_elements.

Yes, you're right but 1/remaining_num_elements is the probability of an element getting picked up in a particular turn. Here we are interested in probability of an element ultimately getting picked up in m turns. which remains same for all the n elements.

Question you need to ask is: Do each of n elements get a fair and equal chance of getting picked up in m turns? Answer is yes because,

P(an element gets finally picked up in set of m elements) = P(element gets picked up in 1st turn) +
P(element doesn't get picked up in 1st turn) * P(element gets picked up in 2nd turn) +
P(element doesn't get picked up in 1st 2 turns) * P(element gets picked up in 3rd turn) + ... and so on

which, If you calculate, remains same for all the n elements initially present.



回答3:

The difference you are seeing in the probability is due to the fact what it is conditional property ( the fact something was selected already and in the last fetch this element was not selected or fetched already). However, over probability , or equal fairness to select any given ball overall does not change.