I'd like to create a random list of integers for testing purposes. The distribution of the numbers is not important. The only thing that counts is time. I know generating random numbers is a time-consuming task, but there must be a better way.
Here's my current solution:
import random
import timeit
# Random lists from [0-999] interval
print [random.randint(0, 1000) for r in xrange(10)] # v1
print [random.choice([i for i in xrange(1000)]) for r in xrange(10)] # v2
# Measurement:
t1 = timeit.Timer('[random.randint(0, 1000) for r in xrange(10000)]', 'import random') # v1
t2 = timeit.Timer('random.sample(range(1000), 10000)', 'import random') # v2
print t1.timeit(1000)/1000
print t2.timeit(1000)/1000
v2 is faster than v1, but it does not work at such a large scale. It gives the following error:
ValueError: sample larger than population
Is there a fast, efficient solution that works at that scale?
Some results from the answers:
Andrew's: 0.000290962934494
gnibbler's: 0.0058455221653
KennyTM's: 0.00219276118279
NumPy came, saw, and conquered.
Your question about performance is moot—both functions are very fast. The speed of your code will be determined by what you do with the random numbers.
However, it's important that you understand the difference in behaviour of those two functions. The randint list comprehension does random sampling with replacement; random.sample does random sampling without replacement.
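To make that difference concrete, here is a minimal sketch. It is written for Python 3 (so range rather than xrange), and the variable names are only illustrative. It shows that independent draws can repeat and can outnumber the range, whereas random.sample refuses to return more items than the population holds.

```python
import random

# With replacement: every draw is independent, so values may repeat and
# the result can be longer than the range it draws from.
with_replacement = [random.randrange(1000) for _ in range(10000)]

# Without replacement: each element is picked at most once, so asking for
# more items than the population contains raises ValueError.
try:
    random.sample(range(1000), 10000)
    sample_error = None
except ValueError as exc:
    sample_error = str(exc)

# A sample no larger than the population works and has no duplicates.
without_replacement = random.sample(range(1000), 10)

print(len(with_replacement), len(set(without_replacement)), sample_error)
```

So the ValueError in the question is a consequence of asking for 10,000 distinct values out of only 1,000, not a scaling limit of the random module.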
It is not entirely clear what you want, but I would use numpy.random.randint:
which gives on my machine:
Note that randint is very different from random.sample (in order for it to work in your case I had to change the 1,000 to 10,000, as one of the commenters pointed out -- if you really want them from 0 to 1,000 you could divide by 10).
And if you really don't care what distribution you are getting then it is possible that you either don't understand your problem very well, or random numbers -- with apologies if that sounds rude...
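As a rough illustration of the numpy.random.randint suggestion, the sketch below draws 10,000 integers in one vectorised call. It is not the benchmark this answer originally ran; it assumes NumPy is installed and uses Python 3 syntax, and the bounds and size are simply taken from the question.

```python
import numpy as np

# numpy.random.randint(low, high, size) draws integers from [low, high),
# i.e. the upper bound is exclusive, unlike random.randint.
numbers = np.random.randint(0, 1000, size=10000)

print(numbers[:10])                   # first few values, as a NumPy array
print(numbers.min(), numbers.max())   # both fall inside 0..999
```

If a plain Python list is needed afterwards, numbers.tolist() converts the array, at some extra cost.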
Firstly, you should use randrange(0,1000) or randint(0,999), not randint(0,1000). The upper limit of randint is inclusive.
For efficiency, randint is simply a wrapper of randrange, which calls random, so you should just use random. Also, use xrange as the argument to sample, not range.
You could generate 10,000 numbers in the range by using sample 10 times.
(Of course this won't beat NumPy.)
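One way the "sample 10 times" idea might look is sketched below. This is an illustrative reconstruction under assumptions, not necessarily the code this answer originally showed: the helper name sample_in_chunks is hypothetical, and it is written for Python 3, where range is already lazy like Python 2's xrange.

```python
import random
from collections import Counter

def sample_in_chunks(population_size, total):
    """Concatenate repeated samples of range(population_size) until
    `total` values have been collected (hypothetical helper)."""
    result = []
    while len(result) < total:
        # Never ask sample() for more items than the population holds.
        need = min(population_size, total - len(result))
        result.extend(random.sample(range(population_size), need))
    return result

numbers = sample_in_chunks(1000, 10000)
counts = Counter(numbers)
print(len(numbers), min(counts.values()), max(counts.values()))
```

Note that the result is ten shuffled copies of 0..999 laid end to end, so every value appears exactly ten times. That is fine when the distribution does not matter, but it is not the same as 10,000 independent draws.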
But since you don't care about the distribution of numbers, why not just use:
?
All the random methods end up calling random.random(), so the best way is to call it directly.
For example, random.randint calls random.randrange. random.randrange has a bunch of overhead to check the range before returning istart + istep*int(self.random() * n).
NumPy is much faster still, of course.
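A minimal sketch of calling random.random() directly, next to the randint version for comparison, could look like this. The function names are made up for illustration, the code targets Python 3, and the timings it prints will vary by machine and Python version, so no numbers are claimed here.

```python
import random
import timeit

def via_randint(n, upper):
    # Goes through randint -> randrange, with their argument checks.
    return [random.randint(0, upper - 1) for _ in range(n)]

def via_random(n, upper):
    # Scale a float from [0.0, 1.0) and truncate to get 0..upper-1.
    rand = random.random  # local alias avoids repeated attribute lookups
    return [int(upper * rand()) for _ in range(n)]

numbers = via_random(10000, 1000)

if __name__ == "__main__":
    runs = 20
    t_randint = timeit.timeit(lambda: via_randint(10000, 1000), number=runs) / runs
    t_random = timeit.timeit(lambda: via_random(10000, 1000), number=runs) / runs
    print("randint: %.6f s per run" % t_randint)
    print("random:  %.6f s per run" % t_random)
```

Both functions draw with replacement from 0..999; only the per-call overhead differs.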