what is the most efficient way to pick a random ca

2019-01-18 01:23发布

问题:

I have an array which tells whether a card is in use:

int used[52];

This is a terrible way to pick a random card if I have many used cards:

do {
  card = rand() % 52;
} while (used[card]);

since if I have only 3-4 unused cards, it'll take forever to find them.

I came up with this:

 int card;
 int k = 0;
 int numUsed = 0;
 for (k=0; k < 52; ++k) {
   if (used[k]) numUsed += 1;
 }
 if (numUsed == 52) return -1;
 card = rand() % (52 - numUsed);

 for (k=0; k < 52; ++k) {
   if (used[k]) continue;
   if (card == 0) return k;
   card -= 1;
 }

which I guess works better if the deck is full, but works worse when the deck is empty since I have to go through two for loops.

What's the most efficient way to do this?

回答1:

I think your two-pass algorithm is likely to be the best you can do, given the constraint you added in a comment that you don't know in advance which cards are eligible for a given draw.

You could try the cunning "select at random from a list of unknown size in a single pass" algorithm:

int sofar = 0;
int selected = -1;
for (i = 0; i < 52; ++i) {
    if (used[i]) continue;
    ++sofar;
    if ((rand() % sofar) == 0) selected = i;
}
if (selected == -1) panic; // there were no usable cards 
else used[selected] = 1;   // we have selected a card

Then if (as you say in a comment) different draws have different criteria, you can replace used[i] with whatever the actual criteria are.

The way it works is that you select the first card. Then you replace it with the second card with probability 1/2. Replace the result with the third card with probability 1/3, etc. It's easy to prove by induction that after n steps, the probability of each of the preceding cards being the selected one, is 1/n.

This method uses lots of random numbers, so it's probably slower than your two-pass version unless getting each item is slow, or evaluating the criteria is slow. It'd normally be used e.g. for selecting a random line from a file, where you really don't want to run over the data twice. It's also sensitive to bias in the random numbers.

It's good and simple, though.

[Edit: proof

Let p(j,k) be the probability that card number j is the currently-selected card after step k.

Required to prove: for all n, p(j,n) = 1/n for all 1 <= j <= n

For n = 1, obviously p(1,1) = 1, since the first card is selected at the first step with probability 1/1 = 1.

Suppose that p(j,k) = 1/k for all 1 <= j <= k.

Then we select the (k+1)th card at step (k+1) with probability 1/(k+1), i.e p(k+1,k+1) = 1/(k+1).

We retain the existing selection with probability k/(k+1), so for any j < k+1:

p(j,k+1) = p(j,k) * k/(k+1)
         = 1/k    * k/(k+1)   // by the inductive hypothesis
         = 1/(k+1)

So p(j,k+1) = 1/(k+1) for all 1 <= k <= k+1

Hence, by induction, for all n: p(j,n) = 1/n for all 1 <= j <= n]



回答2:

Why don't you just keep another collection of unused cards?

If you want them in random order, you can first shuffle them (Fisher-Yates), then pop them off as you need them.



回答3:

The best way to do this is to shuffle the deck into a random order, and then pick the first unused card. Here's the most common way to perform a shuffle like this.



回答4:

The standard algorithm for dealing random cards is.

  • initialise the deck to contain all cards (order not important)
  • loop:
  • generate random index in range 0 to deck-size - 1
  • display card at that index (or do whatever you want)
  • swap indexed card in deck with the card at [deck-size -1]
  • reduce deck-size by one
  • goto loop: as often as required


回答5:

You could get rid of the two loops using a code like:

int card;
int k = 0;
int i = 0;
int unUsed[52];
int numUsed = 0;
for (k = 0; k < 52; ++k) {
  if (used[k]) {
    numUsed += 1;
  } else {
    unUsed[i] = k;
    i++;
  }
}
if (numUsed == 52) return -1;
card = rand() % (52 - numUsed);
return unUsed[card];

Though I would imagine the increase in efficiency wouldn't be big, and you will be using more memory.



回答6:

The other option would be to have two lists, use one to track the used cards and one to track the unused cards. So if you use a card, subtract it from unused card lists and add it to the end of the used card list. This way, you won't have to run two for loops every time.



回答7:

Keep used cards in the end of the array and unused cards in the beginning. Keep track of how many cards have not been used yet. When a new card is used, swap it with the last unused card and decrement the number of remaining cards.

if (numRemaining == 0) return -1;
int cardNum = rand() % numRemaining;
Card card = cards[cardNum]; // or int, if cards are represented by their numbers
cards[cardNum] = cards[numRemaining - 1];
cards[numRemaining - 1] = card;
numRemaining--;


回答8:

Knuth Shuffle in many languages



回答9:

I'm not sure if this will produce truly random draws, but it would avoid loops through the entire deck in almost all cases. I'm even less sure how it would compare performance wise, but here it is nonetheless:

  • get random card from deck
  • if card is already used, pick a random direction (forward or backward)
  • step trough the deck from current position in the randomly determined direction until you find the next unused card (of course you have to make sure you're properly wrapping around at the ends of the array)

So in the worst case, you pick a card right next to the last unused one and then step through the deck in the 'wrong' direction, thus performing a full loop through the deck.