I'm working on a programming project that involves some pretty extensive Monte Carlo simulation in Python, and therefore the generation of a tremendous number of random numbers. Nearly all of them, if not all, can be generated by Python's built-in random module.
I'm something of a coding newbie, and unfamiliar with efficient and inefficient ways to do things. Is it faster to generate, say, all the random numbers up front as a list and then iterate through that list, or to generate a new random number each time a function is called, which will happen inside a very large loop?
Or some other, undoubtedly more clever method?
Python's built-in random module, e.g. random.random(), random.randint() (some distributions are also available; you probably want Gaussian), does about 300K samples/s.
Since you are doing numerical computation, you probably use numpy anyway. It offers better performance if you generate random numbers one array at a time instead of one number at a time, and a wider choice of distributions. 60K arrays/s * 1024 (array length) is ~60M samples/s.
You can also read /dev/urandom on Linux and OS X. My hw/sw (OS X laptop) manages ~10MB/s.
Surely there must be faster ways to generate random numbers en masse, e.g.:
This generates 200MB/s on a single core of an i5-4670K.
Common ciphers like AES and Blowfish manage 112MB/s and 70MB/s on my stack. Furthermore, modern processors make AES even faster, up to some 700MB/s; see this link for test runs on a few hardware combinations (edit: link broken). You could use the weaker ECB mode, provided you feed distinct inputs into it, and achieve up to 3GB/s.
Stream ciphers are better suited to the task, e.g. RC4 tops out at 300MB/s on my hardware. You may get the best results from the most popular ciphers, as more effort was spent optimising those in both hardware and software.
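To see the gap between per-call and array-at-a-time generation on your own machine, a rough timing sketch along these lines may help. The sample count, the seed, and the helper names one_at_a_time and one_array are illustrative assumptions; the throughput figures quoted above come from the answerer's machine, and yours will differ.

```python
import random
import timeit

import numpy as np

N = 100_000  # sample count (assumed), kept small so the comparison runs quickly


def one_at_a_time(n):
    """Draw n uniform floats with one random.random() call per sample."""
    return [random.random() for _ in range(n)]


def one_array(n, rng):
    """Draw n uniform floats with a single vectorised NumPy call."""
    return rng.random(n)


# Seed chosen arbitrarily so that repeated runs draw the same numbers.
rng = np.random.default_rng(seed=12345)

# Time five repetitions of each approach.
t_loop = timeit.timeit(lambda: one_at_a_time(N), number=5)
t_array = timeit.timeit(lambda: one_array(N, rng), number=5)

print(f"random.random() per call: {t_loop:.4f} s")
print(f"NumPy array at a time:    {t_array:.4f} s")
```

If the simulation can consume whole arrays (for example by applying vectorised arithmetic to them), the array version is usually the one to prefer; if it truly needs one number per step, the per-call version is simpler.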
Generate a random number each time. Since the inner workings of the loop only care about a single random number, generate and use it inside the loop.
Example:
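As a minimal sketch of what generating inside the loop looks like (the pi-estimation workload and the function name estimate_pi are illustrative assumptions, not taken from the question):

```python
import random


def estimate_pi(n_trials, seed=None):
    """Estimate pi by sampling points in the unit square, one draw at a time."""
    rng = random.Random(seed)  # private generator so runs can be repeated
    inside = 0
    for _ in range(n_trials):
        # Generate the random numbers right where they are used.
        x = rng.random()
        y = rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n_trials


print(estimate_pi(200_000, seed=42))
```

No list of random numbers is ever built here; each value is consumed as soon as it is drawn.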
Obviously, in practical terms it really doesn't matter, unless you're dealing with billions and billions of iterations, but why bother generating all those numbers if you're only going to be using one at a time?
Code to generate 10M random numbers quickly and efficiently:
Time taken, including the I/O time spent printing to the screen:
The generator expression (the code between parentheses) produces only one item at a time, so it is memory-efficient.
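A sketch of the generator-expression idea, under the assumption that the values are uniform floats from random.random(); the helper name random_stream is hypothetical, and N is set to 1M here (rather than the 10M mentioned above) so that it runs quickly:

```python
import random

N = 1_000_000  # raise to 10_000_000 to match the figure in the answer


def random_stream(n):
    """Return a generator expression that lazily yields n uniform floats."""
    return (random.random() for _ in range(n))


# Consume the stream without ever holding all N values in memory.
total = 0.0
for value in random_stream(N):
    total += value

print(f"mean of {N} samples: {total / N:.4f}")
```

Because the generator is lazy, memory use stays flat regardless of N. It can only be iterated once, though, so build a list instead if the same numbers must be reused.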