A Good and SIMPLE Measure of Randomness-第2页回答

What is the best algorithm to take a long sequence of integers (say 100,000 of them) and return a measurement of how random the sequence is?

The function should return a single result, say 0 if the sequence is not all all random, up to, say 1 if perfectly random. It can give something in-between if the sequence is somewhat random, e.g. 0.95 might be a reasonably random sequence, whereas 0.50 might have some non-random parts and some random parts.

If I were to pass the first 100,000 digits of Pi to the function, it should give a number very close to 1. If I passed the sequence 1, 2, ... 100,000 to it, it should return 0.

This way I can easily take 30 sequences of numbers, identify how random each one is, and return information about their relative randomness.

Is there such an animal?

标签： algorithm random

10条回答

时光不老，我们不散

2楼-- · 2019-01-22 06:44

In Computer Vision when analysing textures, the problem of trying to gauge the randomness of a texture comes up, in order to segment it. This is exactly the same as your question, because you are trying to determine the randomness of a sequence of bytes/integers/floats. The best discussion I could find of image entropy is http://www.physicsforums.com/showthread.php?t=274518 .

Basically, its the statistical measure of randomness for a sequence of values.

I would also try autocorrelation of the sequence with itself. In the autocorrelation result, if there is no peaks other than the first value that means there is no periodicity to your input.

0人赞添加讨论(0) 举报

等我变得足够好

3楼-- · 2019-01-22 06:46

As per Knuth, make sure you test the low-order bits for randomness, since many algorithms exhibit terrible randomness in the lowest bits.

0人赞添加讨论(0) 举报

倾城　Initia

4楼-- · 2019-01-22 06:51

As others have pointed out, you can't directly calculate how random a sequence is but there are several statistical tests that you could use to increase your confidence that a sequence is or isn't random.

The DIEHARD suite is the de facto standard for this kind of testing but it neither returns a single value nor is it simple.

ENT - A Pseudorandom Number Sequence Test Program, is a simpler alternative that combines 5 different tests. The website explains how each of these tests works.

If you really need just a single value, you could pick one of the 5 ENT tests and use that. The Chi-Squared test would probably be the best to use, but that might not meet the definition of simple.

Bear in mind that a single test is not as good as running several different tests on the same sequence. Depending on which test you choose, it should be good enough to flag up obviously suspicious sequences as being non-random, but might not fail for sequences that superficially appear random but actually exhibit some pattern.

0人赞添加讨论(0) 举报

傲

5楼-- · 2019-01-22 06:53

What you seek doesn't exist, at least not how you're describing it now.

The basic issue is this:
If it's random then it will pass tests for randomness; but the converse doesn't hold -- there's no test that can verify randomness.

For example, one could have very strong correlations between elements far apart and one would generally have to test explicitly for this. Or one could have a flat distribution but generated in a very non-random way. Etc, etc.

In the end, you need to decide on what aspects of randomness are important to you, and test for these (as James Anderson describes in his answer). I'm sure if you think of any that aren't obvious how to test for, people here will help.

Btw, I usually approach this problem from the other side: I'm given some set of data that looks for all I can see to be completely random, but I need to determine whether there's a pattern somewhere. Very non-obvious, in general.

0人赞添加讨论(0) 举报

上一页 1 2

A Good and SIMPLE Measure of Randomness

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间