I have a combinatorics problem for which I want to be able to pick an integer at random between 0 and a big integer.
Inadequacies of my current approach
Now for regular integers I would usually write something like `int rand 500;` and be done with it.
But for big integers, it looks like `rand` isn't meant for this.
Using the following code, I ran a simulation of 2 million calls to `rand $bigint`:
$ perl -Mbigint -E 'say int rand 1230138339199329632554990773929330319360000000 for 1 .. 2e6' > rand.txt
The distribution of the resultant set is far from desirable:
- 0 (56 counts)
- magnitude 1e+040 (112 counts)
- magnitude 1e+041 (1411 counts)
- magnitude 1e+042 (14496 counts)
- magnitude 1e+043 (146324 counts)
- magnitude 1e+044 (1463824 counts)
- magnitude 1e+045 (373777 counts)
So the process was never able to choose a number like `999`, or `5e+020`, which makes this approach unsuitable for what I want to do.
It looks like this has something to do with the precision of `rand`, which never went beyond 15 digits in the course of my testing:
$ perl -E 'printf "%.66g", rand'
0.307037353515625
How can I overcome this limitation?
My initial thought is that maybe there is a way to influence the precision of `rand`, but it feels like a band-aid to a much bigger problem (i.e. the inability of `rand` to handle big integers).
In any case, I'm hoping someone has walked down this path before and knows how to remedy the situation.
I was looking at this problem from the wrong angle
The bins are not the same size. Each bin is 10 times the size of the previous one. To put this in perspective, there are 10,000 possible integers at magnitude `1e+44` for every integer with magnitude `1e+40`.
The probability of finding any number with magnitude `1e+20` for the bigint at `1e+45` is less than `0.00000 00000 00000 00000 01 %`.
Forget needles in haystacks, this is more like finding a needle in a quasar!
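As a rough check, the counts a perfectly uniform generator would be expected to produce per bin can be computed directly. The sketch below uses the upper bound and the 2 million draws from the test above; `expected_count` is a helper name made up for this illustration, and the final division is done in ordinary floating point because only an estimate is needed.

```perl
use strict;
use warnings;
use Math::BigInt;

# Upper bound and number of draws taken from the test in the question.
my $n     = Math::BigInt->new('1230138339199329632554990773929330319360000000');
my $draws = 2_000_000;

# Hypothetical helper: expected number of draws landing in
# [10**$exp, 10**($exp + 1)), clipped to [0, $n), for a uniform generator.
sub expected_count {
    my ($exp) = @_;
    my $lo = Math::BigInt->new(10)->bpow($exp);
    my $hi = Math::BigInt->new(10)->bpow($exp + 1);
    $hi = $n->copy if $hi > $n;          # the top bin ends at the bound
    return 0 if $lo >= $hi;              # bin lies entirely above the bound
    my $width = $hi->copy->bsub($lo);
    # Double precision is plenty for an estimate.
    return $draws * $width->numify / $n->numify;
}

printf "magnitude 1e+%03d: expected about %.4g\n", $_, expected_count($_)
    for 20, 40 .. 45;
```

For magnitude `1e+44` this comes out at roughly 1.46 million of the 2 million draws, in line with the observed count, while for magnitude `1e+20` the expected count is far below one.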
(Converted from my comment)
A more theoretically driven approach would be to use multiple calls to the PRNG to create enough random bits for the number you want to sample. Care has to be taken if the number of bits produced by the PRNG is not equal to the number of bits needed, as outlined below!
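For Perl's built-in `rand`, the number of bits delivered per call depends on how the interpreter was built. A minimal sketch of how one might check it, assuming the `randbits` entry of the core `Config` module reflects the generator that `rand` actually uses. The constant `10061` is not from the question itself; it is simply what the sample `rand` output shown there works out to.

```perl
use strict;
use warnings;
use Config;

# Number of random bits per rand() call reported for this perl build
# (assumption: this matches the generator actually used by rand).
my $n_bits_prng = $Config{randbits};
print "rand() provides $n_bits_prng bits per call\n";

# The sample output in the question, 0.307037353515625, is exactly
# 10061 / 2**15, which is what a 15-bit generator would produce.
my $is_15_bit_value = (10061 / 2**15 == 0.307037353515625) ? 1 : 0;
print "question's sample is a multiple of 2**-15: $is_15_bit_value\n";
```

On a build with 15 random bits only 32768 distinct values can come out of a single call, so covering a bound of about 150 bits, like the one in the question, takes several calls.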
Pseudocode
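- Determine the number of bits needed to represent the upper bound: `n_needed_bits`
- Determine the number of bits returned by one call to the PRNG: `n_bits_prng`
- Calculate the number of calls needed: `needed_prng_samples = ceil(n_needed_bits / n_bits_prng)`
- Sample `needed_prng_samples` (calls to PRNG) times & concatenate all the bits obtained
Remarks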
n_possible-sample-numbers-of-full-concatenation / n_possible-sample-numbers-within-range
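The steps above can be sketched in Perl roughly as follows. This is only a sketch under stated assumptions: `rand_bigint_below` is a hypothetical helper name, 15 bits are taken from each `rand` call (assumed to be available on any build), surplus bits are shifted away, and candidates that fall outside the range are rejected and redrawn so that every value in the range stays equally likely. The built-in `rand` is not suitable where cryptographic quality matters.

```perl
use strict;
use warnings;
use Math::BigInt;

my $CHUNK_BITS = 15;    # assumed lower limit on the bits one rand() call provides

# Hypothetical helper: random Math::BigInt in [0, $max), built by
# concatenating 15-bit chunks and rejecting out-of-range candidates.
sub rand_bigint_below {
    my ($max) = @_;
    $max = Math::BigInt->new("$max");    # work on a private copy
    die "upper bound must be a positive integer\n"
        if $max->is_nan || $max <= 0;

    # Bits needed to represent $max - 1 (as_bin returns a "0b" prefix).
    my $n_needed_bits = length($max->copy->bdec->as_bin) - 2;
    my $n_samples     = int(($n_needed_bits + $CHUNK_BITS - 1) / $CHUNK_BITS);
    my $surplus_bits  = $n_samples * $CHUNK_BITS - $n_needed_bits;

    while (1) {
        my $candidate = Math::BigInt->bzero;
        for (1 .. $n_samples) {
            # Shift the bits gathered so far and append a fresh chunk.
            $candidate->blsft($CHUNK_BITS)->badd(int(rand(2**$CHUNK_BITS)));
        }
        $candidate->brsft($surplus_bits);           # keep exactly $n_needed_bits
        return $candidate if $candidate < $max;     # otherwise redraw everything
    }
}

my $n = Math::BigInt->new('1230138339199329632554990773929330319360000000');
print rand_bigint_below($n), "\n";
```

Because the surplus bits are dropped before the range check, fewer than half of the candidates are rejected on average, so the loop normally finishes after one or two rounds.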
One approach is to cut the string representation of the number into chunks and draw each chunk separately. A boolean (`$low`), initialized to false, stays false while the draws so far are equal to the corresponding chunks of the upper bound.
EDIT: added some explanations following comment
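One possible reading of the chunk idea is sketched below; it is an illustration, not the original code. The helper name `rand_by_chunks` and the chunk size of 4 digits are assumptions made for the example, and the upper bound is treated as inclusive.

```perl
use strict;
use warnings;

# Hypothetical helper: draw a decimal string no larger than $upper,
# one chunk of digits at a time.
sub rand_by_chunks {
    my ($upper, $chunk_len) = @_;
    $chunk_len //= 4;    # assumed size; 10**4 values fit even a 15-bit rand()
    die "upper bound must be a string of digits\n" unless $upper =~ /^\d+$/;

    # Left-pad with zeros to a multiple of the chunk size, then split.
    my $pad    = (-length($upper)) % $chunk_len;
    my @chunks = (('0' x $pad) . $upper) =~ /(\d{$chunk_len})/g;

    my $low    = 0;      # true once a draw fell below the bound's chunk
    my $result = '';
    for my $bound (@chunks) {
        # While every draw so far equals the bound, stay within its chunk;
        # afterwards any chunk value is allowed.
        my $limit = $low ? 10**$chunk_len - 1 : $bound;
        my $draw  = int(rand($limit + 1));
        $low ||= $draw < $bound;
        $result .= sprintf '%0*d', $chunk_len, $draw;
    }
    $result =~ s/^0+(?=\d)//;    # strip leading zeros but keep a lone "0"
    return $result;
}

print rand_by_chunks('1230138339199329632554990773929330319360000000'), "\n";
```

Note that drawing each chunk within its own bound makes values that share leading chunks with the upper bound more likely than they would be under a strictly uniform draw, so this sketch should not be taken as uniform.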
The test
Returns