I would like to write a MATLAB script which runs in parallel using multiple CPUs. The script should then print out a sequence of normally distributed random numbers. At the moment my script looks like this:
matlabpool close force local
clusterObj = parcluster;
matlabpool(clusterObj);
parfor K = 1:10
disp(randn)
end
It prints out a sequence of random numbers as expected. However, when I run the code again, it prints out exactly the same sequence of numbers. I do not want this. Each time I run my script it should print out an independent random sequence of numbers. Similarly, each time I start MATLAB, my script should, when I run it for the first time, print out a different sequence of 10 randomly generated numbers. How do I do this?
The solutions given so far are really not correct and may even be bad ideas. One should avoid setting the seed of the generator repeatedly. More importantly, two streams created separately with different seeds are not necessarily independent. This is addressed on this page that describes the creation of multiple streams:
For generator types that do not explicitly support independent streams, different seeds provide a method to create multiple streams. However, using a generator specifically designed for multiple independent streams is a better option, as the statistical properties across streams are better understood.
Thus, to guarantee the best statistical properties it is best to use a generator that explicitly supports multiple independent streams and substreams. Unfortunately, only the multiplicative lagged Fibonacci generator ('mlfg6331_64') and the combined multiple recursive generator ('mrg32k3a') currently support this property. Compared to the default Mersenne Twister generator ('mt19937ar'), these have significantly smaller periods. Here is how you would go about creating and using multiple independent random number streams:
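seed = 1; % fixed seed: the same sequences are produced on every run
n = 10; % number of independent streams, one per loop iteration
[stream{1:n}] = RandStream.create('mrg32k3a','NumStreams',n,'Seed',seed); % n streams returned as separate outputs
parfor k = 1:n
    r = randn(stream{k},[1 3]); % draw a 1-by-3 vector from the k-th stream
    disp(r);
end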
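Because the seed above is fixed, that snippet reproduces the same numbers on every run. If the goal is a different sequence each time, as in the question, one possible variation is to seed the streams from the clock. The following is only a sketch: it assumes the 'Seed','shuffle' and 'CellOutput' options of RandStream.create are available in your release, and the variable names streams and r are illustrative.

```matlab
n = 10;  % number of independent streams, one per loop iteration

% 'Seed','shuffle' seeds from the current time, so each run differs.
% 'CellOutput',true returns the n streams as a single cell array.
streams = RandStream.create('mrg32k3a', 'NumStreams', n, ...
                            'Seed', 'shuffle', 'CellOutput', true);

r = zeros(n, 3);  % collect results instead of printing inside the loop
parfor k = 1:n
    % Each iteration draws from its own stream.
    r(k, :) = randn(streams{k}, 1, 3);
end
disp(r);
```

Collecting the values in r rather than calling disp inside the loop keeps the printed output in a predictable order, since parfor iterations may complete in any order.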
A few further points. You may get much better performance simply generating all of your random numbers in one call outside of your loop. This will also allow you to use the default Mersenne Twister algorithm, which may be important if, for example, you plan on doing large-scale Monte Carlo simulations. If you're going to be working with random numbers (and parallelization) I recommend that you spend some time reading the documentation for the RandStream class and going through the examples here.
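A minimal sketch of the "generate everything outside the loop" idea, assuming the loop only needs one value per iteration (the variable name x is illustrative):

```matlab
rng('shuffle');    % seed the client's generator from the clock, once per session
n = 10;
x = randn(n, 1);   % all draws come from a single stream on the client

parfor k = 1:n
    % x is indexed by the loop variable, so each iteration
    % receives only the value it needs.
    disp(x(k));
end
```

Since every number comes from one stream, there is no question about independence between streams, and the workers' own generators are never used.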
Reset the random number generator used by rand, randi, and randn to its default startup settings, so that rand produces the same random numbers as if you restarted MATLAB®:
rng('default')
rand(1,5)
ans =
0.8147 0.9058 0.1270 0.9134 0.6324
Save the settings for the random number generator used by rand, randi, and randn, generate 5 values from rand, restore the settings, and repeat those values:
s = rng;
u1 = rand(1,5)
u1 =
0.0975 0.2785 0.5469 0.9575 0.9649
rng(s);
u2 = rand(1,5)
u2 =
0.0975 0.2785 0.5469 0.9575 0.9649
Reinitialize the random number generator used by rand, randi, and randn with a seed based on the current time. rand returns different values each time you do this. Note that it is usually not necessary to do this more than once per MATLAB session as it may affect the statistical properties of the random numbers MATLAB produces:
rng('shuffle');
rand(1,5);
I would try different generators:
rng('shuffle', generator)
rng('shuffle', generator) additionally specifies the type of the random number generator used by rand, randi, and randn. The generator input is one of:
'twister' Mersenne Twister
'combRecursive' Combined Multiple Recursive
'multFibonacci' Multiplicative Lagged Fibonacci
'v5uniform' Legacy MATLAB® 5.0 uniform generator
'v5normal' Legacy MATLAB 5.0 normal generator
'v4' Legacy MATLAB 4.0 generator
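For instance, a short sketch that selects the combined multiple recursive generator with a time-based seed and then inspects the resulting settings. Note that this changes the generator in the session where it is called; whether and how it affects parallel workers is not covered here.

```matlab
rng('shuffle', 'combRecursive');  % time-based seed, combined multiple recursive generator
s = rng;                          % current settings: a struct with Type, Seed and State
disp(s.Type);
x = randn(1, 5);                  % drawn from the newly selected generator
disp(x);
```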
You can set the random seed to a different value for each iteration:
matlabpool close force local
clusterObj = parcluster;
matlabpool(clusterObj);
rng('shuffle');
seeds = round(10000*abs(randn(10,1)));
parfor K = 1:10
rng(seeds(K))
disp(randn)
end
Some of the random number generators store a value that is essentially an index into the sequence of pseudo-random numbers for a particular seed value.
When running in parallel, the different CPUs could be overwriting each other's setting of that index value.
You could pre-allocate a vector of random numbers using one CPU and then execute the parallel for loop so that it pulls numbers from that vector.