How to unit test a pseudo random number generator?

2019-01-22 12:02发布

I have a pseudo random number generator (PRNG) class that I want to unit test. There are two approaches:

  1. write a test case that takes a large amount of samples and test whether they are properly distributed. This approach may lead to a fairly long execution time for the test case;
  2. calculate a small series of samples 'by hand' and verify if the PRNG algorithm reproduces it. This approach may lead to a not random sequence being generated without being noticed;

I would say that the first approach is not really unit testing because it does not perform a white box test of the generator, but on the other hand it properly tests the responsibility of the class. The second approach is more like a real unit test, focusing on the algorithm, but it does not provide as much evidence as to whether the class fulfills its responsibility.

Which approach do you prefer, and why?


13条回答
狗以群分
2楼-- · 2019-01-22 12:07

Plesae be aware: If you have 'invented' your PRNG, you will probably have got it wrong and produced something that has a less than optimal distribution. The basic test for how random your generator is, is the Chi-Square test

查看更多
我想做一个坏孩纸
3楼-- · 2019-01-22 12:08

One method is to pipe its output to PractRand.

If PractRand says a PRNG's output is OK, is the PRNG really OK? I'm not qualified to judge, but what I can say is that PRNG is exacting enough that it deemed unsatisfactory the output of various LFSRs and xor-and-shift algorithms I'd found in the literature or online, and deemed OK the output of R. P. Brent's xorgens.

查看更多
姐就是有狂的资本
5楼-- · 2019-01-22 12:11

I would imagine that ultimately you would want both tests - because you want to be sure that both the following hold true:

(1) the numbers are properly distributed and (2) the specific algorithm is working as expected.

Perhaps the first test could only be run occasionally, whereas the second could be use to check that any code changes hadn't broken the algorithm.

查看更多
虎瘦雄心在
6楼-- · 2019-01-22 12:14

The "randomness" in a pseudo-random number generator is generally expressed as the average number of iterations before a number repeats. There are numerous algorithms that have varying "randomness" and performance. Random.org has a good explanation of some analysis done on their algorithms. Check out the pictures about halfway down the page. It's easy to see in the static images just how random the two algorithms are.

One feature of a PRNG (a true feature, not a bug disguised as a feature) is that it is predictable. For a given seed, the same sequence of numbers should be produced. This is extremely important and necessary for testing and debugging programs that use stochastic (aka, random) methods.

The number sequence should approximate a certain statistical distribution. Test your PRNG by generating a large sequence (say 10^6 numbers) and perform several statistical tests on the sequence, particularly the Chi-Squared test (if the distribution is normal). Make a histogram of your sample and see if it looks like you expect.

If you control how the seed is set, the sequence generated should be the same every time, which is suitable for doing white-box testing. When doing your tests, it's also a good idea to "warm up" the generator by running it for 100 iterations or so before collecting your sample.

查看更多
放我归山
7楼-- · 2019-01-22 12:17

Get another implementation of the same PRNG algorithm, generate a smallish number of longish test cases based on known seeds, and verify that your implementation of the algorithm matches everyone else's implementations. The more data you test, the more chance it does. If you want to be serious, look into how FIPS validation is done for the algorithm.

There's no need to test whether the output is random, since far more research has been done on the algorithm by others than you are capable of reproducing.

If you have invented your own PRNG algorithm then you have a rather different problem, because quite aside from testing your code you also need to test your new algorithm. There are various things to do -- I think the most important are statistical testing on the output, and peer review by other cryptographers. Basically, though, if you were to design a PRNG algorithm without having enough knowledge in the field to know how to test it, then it will be rubbish.

查看更多
登录 后发表回答