Generating millions of non-repeating random number

2019-04-14 03:34发布

I have a question, what algorithm can I use to generate a set of 2^21 random unique numbers in Java? is there another library in java that generates random numbers aside math.random?

Thanks in advance!

标签: java random
3条回答
forever°为你锁心
2楼-- · 2019-04-14 04:14

You could use Format-Preserving Encryption to encrypt a counter. Your counter just goes from 0 upwards, and the encryption uses a key of your choice to turn it into a seemingly random value of whatever radix and width you want.

Block ciphers normally have a fixed block size of e.g. 64 or 128 bits. But Format-Preserving Encryption allows you to take a standard cipher like AES and make a smaller-width cipher, of whatever radix and width you want (e.g. radix 2, width 21 for the parameters of the question), with an algorithm which is still cryptographically robust.

It is guaranteed to never have collisions (because cryptographic algorithms create a 1:1 mapping). It is also reversible (a 2-way mapping), so you can take the resulting number and get back to the counter value you started with.

AES-FFX is one proposed standard method to achieve this.

查看更多
女痞
3楼-- · 2019-04-14 04:19

Take a look at Apache Commons Math API's random pacakge http://commons.apache.org/math/userguide/random.html

查看更多
老娘就宠你
4楼-- · 2019-04-14 04:30

The key question is what do you meen by "numbers"?

Generally though, this problem can be solved by 'generate a list of numbers, put that into random order, take the first 2^21 of them' The first part is trivial The second part can be solved by the fisher yates algorithm The real problem is if you want to use a very large space of numbers. Then you need a lazy solution

Here is what I would do: Use a data structure to represent the list that outwardly looks like an array, but internally is represented using a hashtable based sparse array representation. Furthermore, if when attempting to read from a cell, if you don't hit something in the hash, simply return the index for that cell.

Your modified fisher yates stops at 2^21 for the index variable and uses a random variable j between index and the "length" of the array (number of integers)

This lazy approach generates a random non-repeating list of any kind of number in O(n) time and O(n) space where n is the length of the array you are trying to generate. That is the best you can do.

For an explanation of Fisher-Yates http://en.wikipedia.org/wiki/Fisher-Yates_shuffle

查看更多
登录 后发表回答