Create Random Number Sequence with No Repeats

Posted 2019-01-01 04:01

Duplicate:

Unique random numbers in O(1)?

I want a pseudo-random number generator that can generate numbers with no repeats, in a random order.

For example:

random(10)

might return 5, 9, 1, 4, 2, 8, 3, 7, 6, 10

Is there a better way to do it other than making the range of numbers and shuffling them about, or checking the generated list for repeats?


Edit:

Also I want it to be efficient in generating big numbers without the entire range.


Edit:

I see everyone suggesting shuffle algorithms. But if I want to generate large random numbers (1024 bytes or more), then that method would take a lot more memory than if I just used a regular RNG and inserted into a Set until it reached a specified length, right? Is there no better mathematical algorithm for this?

28 answers
明月照影归
#2 · 2019-01-01 04:27

As you generate your numbers, use a Bloom filter to detect duplicates. This would use a minimal amount of memory. There would be no need to store earlier numbers in the series at all.

The trade-off is that your list can never be exhaustive over your range, because a Bloom filter's false positives cause some unused numbers to be rejected as duplicates. If your numbers are truly on the order of 256^1024, that's hardly any trade-off at all.
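A minimal sketch of the idea, assuming 64-bit values rather than 1024-byte ones so that it stays short. The class name, the sizing (16 bits per expected value, 4 hash positions) and the mixing constants are illustrative choices, not anything prescribed above. The output vector exists only so the result can be inspected; the duplicate check itself uses nothing but the filter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Minimal Bloom filter over 64-bit values. It may report a value as already
// seen when it was never inserted (false positive), but never the reverse.
class BloomFilter {
public:
    BloomFilter(std::size_t bitCount, int hashCount)
        : bits_(bitCount, false), hashCount_(hashCount)
    {
        assert(bitCount > 0 && hashCount > 0);
    }

    // Sets the value's bits. Returns true if at least one bit was newly set,
    // which means the value was definitely not inserted before.
    bool insertIfNew(std::uint64_t value)
    {
        bool isNew = false;
        for (int i = 0; i < hashCount_; ++i) {
            std::size_t pos = index(value, i);
            if (!bits_[pos]) {
                bits_[pos] = true;
                isNew = true;
            }
        }
        return isNew;
    }

private:
    // 64-bit mixer (constants borrowed from the splitmix64 finalizer).
    static std::uint64_t mix(std::uint64_t x)
    {
        x ^= x >> 30; x *= 0xBF58476D1CE4E5B9ULL;
        x ^= x >> 27; x *= 0x94D049BB133111EBULL;
        x ^= x >> 31;
        return x;
    }

    // Double hashing: position i is h1 + i * h2, reduced to the bit count.
    std::size_t index(std::uint64_t value, int i) const
    {
        std::uint64_t h1 = mix(value);
        std::uint64_t h2 = mix(value ^ 0x9E3779B97F4A7C15ULL) | 1ULL;
        std::uint64_t h = h1 + static_cast<std::uint64_t>(i) * h2;
        return static_cast<std::size_t>(h % bits_.size());
    }

    std::vector<bool> bits_;
    int hashCount_;
};

// Draws `quantity` distinct 64-bit values, skipping anything the filter
// reports as already seen (including the occasional false positive).
std::vector<std::uint64_t> uniqueRandom(std::size_t quantity, std::mt19937_64& rng)
{
    BloomFilter seen(quantity * 16 + 64, 4);   // sizing is an assumption
    std::vector<std::uint64_t> out;
    out.reserve(quantity);
    while (out.size() < quantity) {
        std::uint64_t r = rng();
        if (seen.insertIfNew(r))
            out.push_back(r);
    }
    return out;
}
```

Calling `uniqueRandom(1000, rng)` with a seeded `std::mt19937_64` should return 1000 values with no repeats; a value wrongly flagged by the filter is simply skipped and redrawn.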

(Of course if they are actually random on that scale, even bothering to detect duplicates is a waste of time. If every computer on earth generated a trillion random numbers that size every second for trillions of years, the chance of a collision is still absolutely negligible.)
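To put a rough number on that, here is a back-of-the-envelope birthday bound. The figures for the number of computers (10^10), the rate (10^12 numbers per second each) and the duration (10^12 years, about 3.2 × 10^19 seconds) are assumptions chosen only to match the spirit of the remark above.

```latex
\[
P(\text{collision}) \;\le\; \binom{k}{2}\frac{1}{N} \;\le\; \frac{k^{2}}{2N},
\qquad N = 256^{1024} = 2^{8192}.
\]
\[
k \approx 10^{10} \times 10^{12}\,\mathrm{s}^{-1} \times 3.2\times 10^{19}\,\mathrm{s}
  \approx 3.2 \times 10^{41} < 2^{138}
\quad\Longrightarrow\quad
P(\text{collision}) < \frac{2^{276}}{2^{8193}} = 2^{-7917}.
\]
```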

残风、尘缘若梦
#3 · 2019-01-01 04:28

To ensure that the sequence never repeats, the generator has to keep a record of the numbers it has already returned. Since it therefore holds the entire list by the end of the algorithm, its storage requirement is equivalent to generating the ordered list and then shuffling it.

More about shuffling here: Creating a random ordered list from an ordered list
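For reference, the generate-then-shuffle approach could look roughly like this in C++. It is a sketch using a Fisher-Yates shuffle; the function name and the choice of `std::mt19937` are my own assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Returns the values 0 .. range-1 in a pseudo-random order using a
// Fisher-Yates shuffle. Memory use is proportional to `range`.
std::vector<int> shuffledRange(int range, std::mt19937& rng)
{
    assert(range >= 0);
    std::vector<int> values(static_cast<std::size_t>(range));
    std::iota(values.begin(), values.end(), 0);   // 0, 1, 2, ...

    for (std::size_t i = values.size(); i > 1; --i) {
        // Pick a position in [0, i-1] and swap that element into slot i-1.
        std::uniform_int_distribution<std::size_t> pick(0, i - 1);
        std::swap(values[i - 1], values[pick(rng)]);
    }
    return values;
}
```

The whole range is materialised before the first number is available, which is exactly the cost the question is trying to avoid for large ranges.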

However, if the range of the random numbers is very large but the quantity of numbers required is small (you've hinted in a comment that this is the actual requirement), then generating a complete list and shuffling it is wasteful. A shuffle on a huge array accesses pages of virtual memory in a way that (by definition) defeats the OS's paging system (on a smaller scale, the same problem occurs with the CPU's memory cache).

In this case, searching the list-so-far will be much more efficient. So the ideal would be to use heuristics (determined by experiment) to pick the right implementation for the given arguments. (Apologies for giving examples in C# rather than C++ but ASFAC++B I'm training myself to think in C#).

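// Requires: using System; using System.Collections.Generic;
// Threshold, Shuffle and Random are placeholders to be supplied by the caller.
IEnumerable<int> GenerateRandomNumbers(int range, int quantity)
{
    if (quantity > range)
        throw new ArgumentException("quantity cannot exceed range");

    if (range < Threshold)
    {
        // Small range: build every value, shuffle, keep the first `quantity`.
        int[] all = new int[range];

        for (int n = 0; n < range; n++)
            all[n] = n;

        Shuffle(all);
        Array.Resize(ref all, quantity);
        return all;
    }

    // Large range: draw values and retry on collision.
    int[] a = new int[quantity];
    HashSet<int> used = new HashSet<int>();

    for (int n = 0; n < quantity; n++)
    {
        int r = Random(range);

        // HashSet.Add returns false if r was already present.
        while (!used.Add(r))
            r = Random(range);

        a[n] = r;
    }

    return a;
}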

Checking for repeated numbers and looping on collisions is expensive, but there will likely be some Threshold value above which it becomes faster than allocating for the entire range.

For sufficiently small quantity requirements, it may be faster to use an array for used and do linear searches in it, due to the greater locality, lower overhead, the cheapness of the comparison...
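A C++ sketch of that small-quantity variant, assuming a plain `std::vector` scanned with `std::find` in place of the hash set. The function name is hypothetical, and where the break-even point lies against a hash set would have to be measured.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Draws `quantity` distinct values from [0, range) and detects repeats with
// a linear search over the values drawn so far. Intended for small
// quantities only: the search cost grows with every value added.
std::vector<int> fewUniqueRandom(int range, int quantity, std::mt19937& rng)
{
    assert(range > 0 && quantity >= 0 && quantity <= range);
    std::uniform_int_distribution<int> dist(0, range - 1);

    std::vector<int> used;
    used.reserve(static_cast<std::size_t>(quantity));
    while (static_cast<int>(used.size()) < quantity) {
        int r = dist(rng);
        // Keep r only if it has not been drawn before.
        if (std::find(used.begin(), used.end(), r) == used.end())
            used.push_back(r);
    }
    return used;
}
```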

Also, for large quantities AND large ranges, it might be preferable to return an object that produces the numbers in the sequence on request, instead of allocating the array for the results upfront. This is very easy to implement in C# thanks to the yield return statement:

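IEnumerable<int> ForLargeQuantityAndRange(int quantity, int range)
{
    // Values handed out so far; grows as the caller consumes the sequence.
    HashSet<int> used = new HashSet<int>();

    for (int n = 0; n < quantity; n++)
    {
        int r = Random(range);

        while (!used.Add(r))
            r = Random(range);

        yield return r;
    }
}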
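C++ has no direct counterpart to yield return in older standards, so a rough equivalent is a small object with a `next()` method. This is only a sketch; the class name and the seeding scheme are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <unordered_set>

// Produces distinct pseudo-random values from [0, range) one at a time.
// Only the values already handed out are stored.
class UniqueRandomStream {
public:
    // Precondition: range > 0.
    UniqueRandomStream(int range, unsigned seed)
        : range_(range), rng_(seed), dist_(0, range - 1)
    {
        assert(range > 0);
    }

    // True once every value in the range has been returned.
    bool exhausted() const
    {
        return used_.size() == static_cast<std::size_t>(range_);
    }

    // Returns a value that has not been returned before.
    int next()
    {
        assert(!exhausted());
        int r = dist_(rng_);
        // insert().second is false when r was already in the set.
        while (!used_.insert(r).second)
            r = dist_(rng_);
        return r;
    }

private:
    int range_;
    std::mt19937 rng_;
    std::uniform_int_distribution<int> dist_;
    std::unordered_set<int> used_;
};
```

A caller would construct `UniqueRandomStream stream(range, seed);` and call `stream.next()` as many times as needed. The retry loop gets slower as the set approaches the full range, so this fits the "quantity much smaller than range" case best.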
路过你的时光
#4 · 2019-01-01 04:28

I understand you don't want a shuffle for large ranges, since you'd have to store the whole list to do so.

Instead, use a reversible pseudo-random hash, then feed in the values 0, 1, 2, 3, 4, 5, 6, etc. in turn. Because the hash is reversible, no two inputs can map to the same output, so the results never repeat.

There are infinitely many hashes like this. They're not too hard to generate if they're restricted to a power of 2, but any base can be used.

Here's one that would work, for example, if you wanted to go through all 2^32 32-bit values. It's the easiest to write because the implicit mod 2^32 of unsigned integer math works to your advantage in this case.

unsigned int reversableHash(unsigned int x)
{
   x*=0xDEADBEEF;
   x=x^(x>>17);
   x*=0x01234567;
   x+=0x88776655;
   x=x^(x>>4);
   x=x^(x>>9);
   x*=0x91827363;
   x=x^(x>>7);
   x=x^(x>>11);
   x=x^(x>>20);
   x*=0x77773333;
   return x;
}
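For a range that is not a power of 2, one possible approach (my assumption, not something spelled out above) is "cycle walking": take a reversible hash on the next power of 2 and re-apply it until the result falls inside the range. The sketch below uses its own small mixing function with arbitrary constants rather than the hash above, and it makes no claim about statistical quality.

```cpp
#include <cassert>
#include <cstdint>

// Bijection on [0, 2^bits), for 1 <= bits <= 32. Every step (multiplying by
// an odd constant, adding a constant, xor with a right shift) is invertible
// modulo 2^bits. The constants are arbitrary illustrative picks.
std::uint32_t permuteBits(std::uint32_t x, int bits)
{
    const std::uint32_t mask = (bits == 32) ? 0xFFFFFFFFu : ((1u << bits) - 1u);
    const int shift = bits / 2 + 1;

    x = (x * 0x9E3779B1u) & mask;                 // odd multiplier
    x ^= x >> shift;
    x = (x * 0x85EBCA6Bu + 0x01234567u) & mask;   // odd multiplier, then add
    x ^= x >> shift;
    return x;
}

// Returns the i-th element of a fixed pseudo-random permutation of [0, n),
// for 0 <= i < n. No list of earlier results is stored.
std::uint32_t permutedIndex(std::uint32_t i, std::uint32_t n)
{
    assert(n > 0 && i < n);

    // Smallest power of two that is >= n.
    int bits = 1;
    while (bits < 32 && (1ull << bits) < n)
        ++bits;

    // Cycle walking: follow the permutation until it lands back in [0, n).
    std::uint32_t x = permuteBits(i, bits);
    while (x >= n)
        x = permuteBits(x, bits);
    return x;
}
```

Calling `permutedIndex(i, n)` for i = 0, 1, 2, ... up to n-1 visits every value below n exactly once. Since the power of 2 is less than twice n, each call is expected to need only a couple of hash applications.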
千与千寻千般痛.
#5 · 2019-01-01 04:31

Please check the answers at

Generate sequence of integers in random order without constructing the whole list upfront

My answer there is:

A very simple generator is 1+((power(r,x)-1) mod p). For x from 1 to p-1 it yields each value from 1 to p-1 exactly once, in a scrambled order, where p is a prime and r is a primitive root modulo p. It is not enough for r and p to be distinct primes: with r = 3 and p = 11 the sequence repeats after five values.
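A small sketch of that formula, under the assumption that p is prime and r is a primitive root modulo p (otherwise the sequence repeats before covering 1 to p-1). The function names are hypothetical, and the modulus is kept below 2^32 so the 64-bit products cannot overflow.

```cpp
#include <cassert>
#include <cstdint>

// Computes (base^exp) mod m by repeated squaring.
// Assumes 0 < m < 2^32 so that the 64-bit products cannot overflow.
std::uint64_t powMod(std::uint64_t base, std::uint64_t exp, std::uint64_t m)
{
    assert(m > 0 && m < (1ull << 32));
    std::uint64_t result = 1 % m;
    base %= m;
    while (exp > 0) {
        if (exp & 1)
            result = result * base % m;
        base = base * base % m;
        exp >>= 1;
    }
    return result;
}

// x-th term of the sequence 1 + ((r^x - 1) mod p), for x = 1 .. p-1.
// Adding p before subtracting 1 keeps the unsigned arithmetic non-negative.
std::uint64_t powerSequence(std::uint64_t r, std::uint64_t p, std::uint64_t x)
{
    return 1 + (powMod(r, x, p) + p - 1) % p;
}
```

With r = 2 and p = 11, the terms for x = 1 to 10 are 2, 4, 8, 5, 10, 9, 7, 3, 6, 1. The order is fully determined by r and p, so it is scrambled rather than unpredictable.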