Efficient way to generate lots of random numbers

2019-08-12 09:14发布

问题:

I have a java method that has to generate lots of random numbers in a very short period of time. My first approach was to use Math.random (which works really fast), but I have the presumption that because I call the Math.random so quick on behind the other, the "random" isn't really random (or less random) because of that (but I need it to be as random as possible).

I now have two questions:

  1. Is my presumption right, that because of the number of calls in a very short period of time the random output gets less random? And if the answer for 1. is Yes:
  2. What would be the fastest way (per call) to remove the problem with the less randomness?

I have already played around with the SecureRandom, but it is minimum 15 times slower than the normal Math.random, which is too slow for my requirements.

回答1:

TL;DR: Your presumption is wrong.

Math.random acts on a single instance on java.util.Random:

Returns a double value with a positive sign, greater than or equal to 0.0 and less than 1.0. Returned values are chosen pseudorandomly with (approximately) uniform distribution from that range.

When this method is first called, it creates a single new pseudorandom-number generator, exactly as if by the expression

new java.util.Random()

From the JavaDoc

Now, java.util.Random uses a linear congruential formula that is seeded with a number that is "very likely to be distinct from any other invocation of this constructor."1

As this is a pseudorandom progression - i.e. it will give exactly the same values from the same seed - the speed at which you extract numbers from Math.random has no impact on their randomness.



回答2:

Random numbers using the Random class use an algorithm that bit mangles an int to give you a new int. It will use the same algorithm regardless of how quickly or how many times you call it. The progression is the progression.

To test this, seed it with a number, like 42. Then watch the progression. Seed it with the same number again. Same exact progression.

The downside to this approach is that the numbers are not TRULY random. They're pretty random, and good enough for most things, but not perfectly random.

I ran the output of the Random method through the die hard battery of tests. It passed most of them with flying colors, one it was borderline, and one it just flat failed. That's the kind of random we're talking about.

Plus, because it uses a date time stamp to seed itself, it is somewhat predictable in some circumstances. Picture someone that boots up and runs your task every Monday morning first thing for that week. There is some predictability because it will run with a timestamp of Monday morning between 8 and 8:30.

So, Random is good enough for most operations that don't have to do with security. Even a lot of them.

SecureRandom, on the other hand, will generate truly random numbers. It does this by looking at system timings and other things that vary from second to second based on a myriad of factors.

The downside is that these factors only change so often in a second, so SecureRandom can only generate a finite number of random numbers in a period of time. It does try to generate some ahead of time and cache them for use, but you can blow the cache.

In this way, it's like my reverse osmosis water filter. It holds a gallon of water that it has already filtered. If you use the whole gallon of water in one shot, then you get it at the rate it filters it--something like 1 ounce per 5 seconds or some such. The first gallon is fast, then it's really slow.



回答3:

If you can use Java8, I recommend the java.utils.SplitableRandom. It is faster and has better statistical distribution. In my test java.utils.SplitableRandom is 30 times faster than java.utils.Random.

I used tobijdc answer to write this answer.



回答4:

While Random is likely good enough, you can improve on Math.random() by using a function which closer to what you need. e.g.

Random rand = new Random();

for ( loop ) {
   int dice = rand.nextInt(6) + 1;

This is much faster than using Math.random() but if you need a long

long l = rand.nextLong();

In this case l has 64-bits of randomness but Math.random() has 53-bits at best (actually it has only 48-bits)



回答5:

If you need lots of numbers quickly (as for simulation or Monte Carlo integration), then a Cryptographically Secure RNG won't be fast enough. java.util.Random() is fast, but a very poor quality PRNG. What you need is a high-quality fast PRNG, like Mersenne Twister or XorShift. Take a look at http://xorshift.di.unimi.it/ , for example, or my own ojrandlib.



回答6:

Try the java kiss library AESPRNG internal generator. It is thread safe and about twice as fast as Random when used in bulk requests, and can produce 128 bit cryptographically strong (but repeatable, if you reset the seed) pseudo-random numbers. It is based on AES CTR mode, which is highly optimized on most systems.

kiss.util.AESPRNG prng = new kiss.util.AESPRNG();
double [] x = new double [1_000_000];
prng.nextDoubles(x,0,x.length);

If you want a repeatable sequence, use seed(byte[] value16) or seed(double value). To reset the sequence. It is a drop-in replacement for Random, but with a number of convenience methods for ranges or bulk numbers. It is really just better than any of the suggested alternatives: fast, repeatable, and 128-bit strongly random.



回答7:

  1. The (pseudo) random number generator produces the same results given the same initial seed values regardless of the frequency with which it is called. It is entirely deterministic and independent of speed. The selection of the seed is dependent on the time (if not explicitly specified), but not the sequence generated.
  2. If you need faster speed, you can pre-compute the values of a pseudo random number sequence larger than the length you need, and then use just one call to he generator to select a starting position in the sequence. This way, you can simply read out values after one initial call on all subsequent runs. Your performance will be limited by the speed at which you can index and read the memory holding the table. Depending on your application, potential reuse of the sequence may not be advisable.