I've been using Random (java.util.Random)
to shuffle a deck of 52 cards. There are 52! (8.0658175e+67) possibilities. Yet, I've found out that the seed for java.util.Random
is a long
, which is much smaller at 2^64 (1.8446744e+19).
From here, I'm suspicious whether java.util.Random
is really that random; is it actually capable of generating all 52! possibilities?
If not, how can I reliably generate a better random sequence that can produce all 52! possibilities?
Let me apologize in advance, because this is a little tough to understand...
First of all, you already know that
java.util.Random
is not completely random at all. It generates sequences in a perfectly predictable way from the seed. You are completely correct that, since the seed is only 64 bits long, it can only generate 2^64 different sequences. If you were to somehow generate 64 real random bits and use them to select a seed, you could not use that seed to randomly choose between all of the 52! possible sequences with equal probability.However, this fact is of no consequence as long as you're not actually going to generate more than 2^64 sequences, as long as there is nothing 'special' or 'noticeably special' about the 2^64 sequences that it can generate.
Lets say you had a much better PRNG that used 1000-bit seeds. Imagine you had two ways to initialize it -- one way would initialize it using the whole seed, and one way would hash the seed down to 64 bits before initializing it.
If you didn't know which initializer was which, could you write any kind of test to distinguish them? Unless you were (un)lucky enough to end up initializing the bad one with the same 64 bits twice, then the answer is no. You could not distinguish between the two initializers without some detailed knowledge of some weakness in the specific PRNG implementation.
Alternatively, imagine that the
Random
class had an array of 2^64 sequences that were selected completely and random at some time in the distant past, and that the seed was just an index into this array.So the fact that
Random
uses only 64 bits for its seed is actually not necessarily a problem statistically, as long as there is no significant chance that you will use the same seed twice.Of course, for cryptographic purposes, a 64 bit seed is just not enough, because getting a system to use the same seed twice is computationally feasible.
EDIT:
I should add that, even though all of the above is correct, that the actual implementation of
java.util.Random
is not awesome. If you are writing a card game, maybe use theMessageDigest
API to generate the SHA-256 hash of"MyGameName"+System.currentTimeMillis()
, and use those bits to shuffle the deck. By the above argument, as long as your users are not really gambling, you don't have to worry thatcurrentTimeMillis
returns a long. If your users are really gambling, then useSecureRandom
with no seed.If you consider the number as just an array of bits (or bytes) then maybe you could use the (Secure)
Random.nextBytes
solutions suggested in this Stack Overflow question, and then map the array into anew BigInteger(byte[])
.A very simple algorithm is to apply SHA-256 to a sequence of integers incrementing from 0 upwards. (A salt can be appended if desired to "get a different sequence".) If we assume that the output of SHA-256 is "as good as" uniformly distributed integers between 0 and 2256 - 1 then we have enough entropy for the task.
To get a permutation from the output of SHA256 (when expressed as an integer) one simply needs to reduce it modulo 52, 51, 50... as in this pseudocode:
In general, a pseudorandom number generator (PRNG) can't choose from among all permutations of a 52-item list if its state length is less than 226 bits.
java.util.Random
implements an algorithm with a modulus of 248; thus its state length is only 48 bits, so much less than the 226 bits I referred to. You will need to use another PRNG with a bigger state length — specifically, one with a period of 52 factorial or greater.See also "Shuffling" in my article on random number generators.
This consideration is independent of the nature of the PRNG; it applies equally to cryptographic and noncryptographic PRNGs (of course, noncryptographic PRNGs are inappropriate whenever information security is involved).
Although
java.security.SecureRandom
allows seeds of unlimited length to be passed in, theSecureRandom
implementation could use an underlying PRNG (e.g., "SHA1PRNG" or "DRBG"). And it depends on that PRNG's period (and to a lesser extent, state length) whether it's capable of choosing from among 52 factorial permutations. (Note that I define "state length" as the "maximum size of the seed a PRNG can take to initialize its state without shortening or compressing that seed").Short solution which is essentially the same of dasblinkenlight:
You don't need to worry about the internal state. Long explanation why:
When you create a
SecureRandom
instance this way, it accesses an OS specific true random number generator. This is either an entropy pool where values are accessed which contain random bits (e.g. for a nanosecond timer the nanosecond precision is essentially random) or an internal hardware number generator.This input (!) which may still contain spurious traces are fed into a cryptographically strong hash which removes those traces. That is the reason those CSPRNGs are used, not for creating those numbers themselves! The
SecureRandom
has a counter which traces how many bits were used (getBytes()
,getLong()
etc.) and refills theSecureRandom
with entropy bits when necessary.In short: Simply forget objections and use
SecureRandom
as true random number generator.I'm going to take a bit of a different tack on this. You're right on your assumptions - your PRNG isn't going to be able to hit all 52! possibilities.
The question is: what's the scale of your card game?
If you're making a simple klondike-style game? Then you definitely don't need all 52! possibilities. Instead, look at it like this: a player will have 18 quintillion distinct games. Even accounting for the 'Birthday Problem', they'd have to play billions of hands before they'd run into the first duplicate game.
If you're making a monte-carlo simulation? Then you're probably okay. You might have to deal with artifacts due to the 'P' in PRNG, but you're probably not going to run into problems simply due to a low seed space (again, you're looking at quintillions of unique possibilities.) On the flip side, if you're working with large iteration count, then, yeah, your low seed space might be a deal-breaker.
If you're making a multiplayer card game, particularly if there's money on the line? Then you're going to need to do some googling on how the online poker sites handled the same problem you're asking about. Because while the low seed space issue isn't noticeable to the average player, it is exploitable if it's worth the time investment. (The poker sites all went through a phase where their PRNGs were 'hacked', letting someone see the hole cards of all the other players, simply by deducing the seed from exposed cards.) If this is the situation you're in, don't simply find a better PRNG - you'll need to treat it as seriously as a Crypto problem.