Using rng('shuffle') function in parallel

Posted 2019-06-04 04:07

Question:

I have a parfor loop for parallel computing in MATLAB. I want different random numbers on every call of this parfor loop on 8 workers. If I don't use rng('shuffle'), I get the same random numbers from randperm(10). In my code, rng('shuffle') runs before randperm at the same time on all workers. Will I get different random numbers this way? When I look at the randperm outputs in the parfor loop, some of them are the same!

Do I need to save the generator state before rng('shuffle') and then use something like rng(saved_rng) after the parallel loop ends?

We have this in the MATLAB help:

Note Because rng('shuffle') seeds the random number generator based on the current time, you should not use this command to set the random number stream on different workers if you want to assure independent streams. This is especially true when the command is sent to multiple workers simultaneously, such as inside a parfor, spmd, or a communicating job. For independent streams on the workers, use the default behavior; or if that is not sufficient for your needs, consider using a unique substream on each worker.

So what should I do? Will I get different random numbers if I delete rng? I have two versions of this code: one uses parfor and the other uses a for loop. Can I remove shuffle from the for loop? Will I still get different random numbers in that case?

Thanks.

P.S.

I can have these structures:

parfor I=1:X
    xx = randperm(10)
end

parfor I=1:X
    rng('shuffle');
    xx = randperm(10)
end

rng('shuffle');
parfor I=1:X
    xx = randperm(10)
end

I want different random numbers from the randperm function. How can I do that? For the for-loop structure I need the shuffle call (without it the random numbers are the same), but when I add it to parfor, some of the randperm outputs are the same!

Answer 1:

To do this properly, you need to choose an RNG algorithm that supports parallel substreams (in other words, you can split up the random stream into substreams, and each of the substreams still has the right statistical properties that you want from a random stream).

The default RNG algorithm (Mersenne Twister, or mt19937ar) does not support parallel substreams, but MATLAB provides algorithms that do, including the multiplicative lagged Fibonacci generator mlfg6331_64 and the combined multiple recursive generator mrg32k3a.

For example:

s = RandStream.create('mrg32k3a','NumStreams',4,'Seed','shuffle','CellOutput',true)

s is now a cell array of four independent random number streams. All are created from the same seed, and you can record s{1}.Seed for reproducibility if you want.

Now, you can call rand(s{1}) (or randn(s{1})) to generate random numbers from stream 1, and so on. Reset a stream to its initial configuration with reset(s{1}), and you should find that each stream is separately reproducible.
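As a small check of the reset behaviour described above, the following sketch draws from one stream, resets it, and draws again. The fixed seed 42 and the variable names a, b and c are illustrative assumptions, not part of the original answer.

```matlab
% Create four independent streams with a fixed (illustrative) seed.
s = RandStream.create('mrg32k3a','NumStreams',4,'Seed',42,'CellOutput',true);

a = rand(s{1},1,5);   % first five draws from stream 1
reset(s{1});          % rewind stream 1 to its initial state
b = rand(s{1},1,5);   % should repeat the same five values

c = rand(s{2},1,5);   % stream 2 should give a different sequence

disp(isequal(a,b));   % true if stream 1 is reproducible after reset
disp(isequal(a,c));   % false if streams 1 and 2 differ
```

A fixed seed is used here only so the comparison is repeatable; with 'Seed','shuffle' the same check should hold within a single session.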

Each worker can then generate random numbers in a way that is still statistically sound, and reproducible even in parallel:

parfor i = 1:4
    rand(s{i})
end
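For the randperm case in the question, one possible variant is to give every loop iteration its own substream of a single generator, which is roughly what the MATLAB help note means by "a unique substream on each worker". This is a sketch under assumptions: the iteration count N, the variable names, and the choice to rebuild the stream inside each iteration are illustrative, and it relies on randperm accepting a RandStream object as its first argument.

```matlab
N = 8;                                         % number of iterations (illustrative)

% Pick a time-based seed once, on the client, and keep it for reproducibility.
s0   = RandStream('mrg32k3a','Seed','shuffle');
seed = double(s0.Seed);

P = zeros(N,10);                               % one permutation per row
parfor i = 1:N
    % Same generator and seed in every iteration, but a distinct substream.
    st = RandStream('mrg32k3a','Seed',seed);
    st.Substream = i;
    P(i,:) = randperm(st,10);
end

disp(P)
```

Because the substream is tied to the loop index rather than to a worker, the result should not depend on how iterations are distributed, and rerunning with the recorded seed should reproduce P. Independent substreams do not guarantee that every row is distinct; two rows can still coincide by chance, although that is unlikely for permutations of 10 elements.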

For more information, look in the Statistics Toolbox documentation under Speed up Statistical Computations. Several articles there walk through the details. If you don't have Statistics Toolbox, the documentation is available online on the MathWorks website.