I want to generate 10000 random integers between 0 and 10^12. Usually, the code would look like this:
x <- sample(0:1000000000000,10000,replace=T)
But I get the following error message:
Error in 0:1000000000000 : result would be too long a vector
Is there a more memory-efficient method that doesn't have to put 10^12 integers in a vector just to get a sample of size 10000? If not, is there a way to increase the maximum size of a vector? I'm working on a 64-bit OS with 12 GB of free RAM.
I do not understand why you cannot just do...
N.B. This does not mean that `sample` has to generate the vector `1:x`! @James points out that for sampling of `0:x` you will need to adjust to `sample(10^12+1,10,replace=TRUE)-1`
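The call described above can be sketched as follows. This is a minimal illustration, assuming an R version (3.0.0 or later) in which `sample` accepts a single large number and draws from `1:n` without building that vector; the variable names and the seed are illustrative choices, not part of the original answer.

```r
# Passing a single number n makes sample() draw from 1:n
# without materialising the vector 1:n.
set.seed(42)  # arbitrary seed, only for reproducibility

n_draws <- 10000
upper <- 10^12

# Values in 1..10^12
x1 <- sample(upper, n_draws, replace = TRUE)

# Values in 0..10^12: widen the range by one, then shift down by one
x0 <- sample(upper + 1, n_draws, replace = TRUE) - 1

range(x0)
```

The results come back as doubles rather than integers, because 10^12 is larger than R's integer type can hold.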
The package `extraDistr` provides a range of additional probability distributions to sample from, including a discrete uniform distribution. Random sampling with the function `rdunif` works like the other random sampling functions from `stats` included with R, such as `runif`, and avoids needing to round as in other solutions:
FYI: `as.integer` performs a truncation, not a rounding.
To test whether it works, you can generate numbers in a smaller interval (e.g. from 0 to 6) and plot a histogram of the result to see whether it looks like a uniform distribution, i.e.
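A small-interval check of this kind might look like the sketch below. It uses base R only and counts the values with `table` instead of plotting; the sample size, seed, and use of `floor` are assumptions made for illustration, not the original poster's code.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Draw whole numbers 0..6 by flooring uniform draws from (0, 7)
small <- floor(runif(10000, min = 0, max = 7))

# Each of the 7 values should appear roughly 10000 / 7 times
counts <- table(small)
print(counts)

# Visual check, if a graphics device is available:
# hist(small, breaks = seq(-0.5, 6.5, by = 1))

# as.integer() truncates toward zero rather than rounding
trunc_demo <- as.integer(c(0.9, 6.99))  # gives 0 and 6
```

For the full 0 to 10^12 range, `floor` is the safer choice in a sketch like this, since `as.integer` returns `NA` for values above `.Machine$integer.max`.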
The real problem lies in the fact that you cannot store the sequence `0:10^12` in memory. By just defining 0 and 10^12 as the boundaries of a uniform distribution, you could get what you seek:
This will draw from the uniform distribution (with replacement, though I doubt that matters).
However, what you cannot see is that these are actually floating-point numbers.
You can use `ceiling` to round them up:
So the full code would be:
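Putting the two steps together, a minimal sketch might look like this. The variable names and the seed are illustrative assumptions; only `runif` and `ceiling` come from the answer itself.

```r
set.seed(123)  # arbitrary seed, only for reproducibility

# 10000 continuous draws from the uniform distribution on (0, 10^12)
u <- runif(10000, min = 0, max = 10^12)

# Round each draw up to the next whole number, giving values in 1..10^12
x <- ceiling(u)

head(x)
```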
Further nitpicking:
Note that this technically never yields 0 (since even 0.0001 would be rounded up to 1), so you could just draw from
As Carl Witthoft mentions, numbers that are too large for R's integer type are stored as floating-point values, so you cannot count on these numbers being of type integer. You can still count on them to evaluate to `TRUE` when compared to the same floating-point number without decimals, though.
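The distinction between the storage type and the value can be checked directly. The following is a small sketch, with the sample size and seed chosen only for illustration:

```r
set.seed(7)  # arbitrary seed, only for reproducibility
x <- ceiling(runif(5, min = 0, max = 10^12))

is.integer(x)          # FALSE: the values are stored as doubles
.Machine$integer.max   # 2147483647, the largest value R's integer type can hold
all(x == round(x))     # TRUE: each value is still a whole number
```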