What's a good one-pass pseudo-random shuffle?

2020-05-13 16:05发布

问题:

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 8 years ago.

The Fisher-Yates shuffle gives a nice algorithm to shuffle an array A of length n in a single pass:

For k = 1 to n
    Pick a random integer j from k to n
    Swap A[k] and A[j]

After a single pass through this algorithm, the entries of A occur uniformly at random.

A common way to botch this algorithm is to do the following:

For k = 1 to n
    Pick a random integer j from 1 to n
    Swap A[k] and A[j]

The resulting distribution from a single pass through this algorithm is not uniformly random, and there is a nice discussion of what it is at this post: What distribution do you get from this broken random shuffle?

I recently read a delightful article by Diaconis, Fulman and Holmes entitled Analysis of Casino Shelf Shuffling Machines where the authors describe a physical machine that does the following batch shuffle:

For k = 1 to n
    Pick a random integer j from 1 to 10
    Randomly choose to place card k on the top or bottom of stack j

The question the authors address is whether or not this gives a reasonably random ordering after a single pass. The answer is decidedly not. One way to see the flaw in this shuffle is to start with a deck of cards that has n/2 red cards atop of n/2 black cards. The resulting deck after a single pass will have at most 10 clumps of red cards! For n = 52*6, this isn't terribly random. The authors also show that an optimal "guess the next card" strategy for the once shuffled will, on average, correctly guess 9.5 cards, whereas an optimal strategy for a random deck will average only 4.5 cards correctly guessed.

Are there any other interesting single-pass shuffles that achieve near-randomness and/or interesting distributions? I'm especially interested in shuffles similar to the latter that work with batches of entries.

回答1:

If you have a shuffled desk, into which you wish to shuffle a batch of new cards (and you know that none of the cards are duplicates), then I think the following is valid.

ForEach card in batch:
    gap = random(deck.size() + 1)  # choose a gap between cards, before first, or after last.
    deck.insertAt(gap,card)

Distribution

The distribution of random is uniform, and the order of the deck is unchanged, so still uniform. I think the result should be uniform. (My stats is too rusty to be sure).

Time

Assuming that insertAt is O(1) not O(N) - which depends upon the implementeation of deck - the whole routine is O(batch size) - which is the best you can hope for becuase you have to handle each card.