
MATLAB: How to set random seed in parfor to produc

Posted 2019-02-10 10:19

Question:

I set up the following minimal example:

rng(0);

randseedoffset = random('unid', 10^5) + 1;

t = cell(10,1);
for i = 1:10
    rng(randseedoffset+i);
    t{i} = random('unid', 1000);
end

disp(t);

This generates 10 random numbers and stores them in t. The results are the same on every run because I set the seed with rng inside the for loop.

If I change for to parfor, I get different results, although they are also reproducible from run to run.

I want to speed up my code with parfor and still obtain exactly the same random numbers as with for.

Answer 1:

Ok, I just found the reason:

MATLAB supports several random number generation algorithms. In the default configuration of the current version, the client session uses the Mersenne Twister. Inside a parfor loop, the workers use a different generator, which MATLAB calls the 'Combined Recursive Method'.
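To see the difference on your own installation, a minimal sketch like the following compares the generator type reported by rng on the client with the one reported on a worker. It assumes the Parallel Computing Toolbox is available; the variable names clientType and workerType are only illustrative, and the reported worker type may differ between MATLAB releases.

```matlab
% Generator type used by the client session
clientSettings = rng;
clientType = clientSettings.Type;
disp(clientType);

% Generator type used by the pool workers
spmd
    workerSettings = rng;              % query the generator on each worker
    workerType = workerSettings.Type;
end
disp(workerType{1});                   % type reported by worker 1
```

If the two strings differ, the same seed cannot produce the same numbers on the client and on the workers.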

The problem can be fixed by explicitly setting the generator type to 'twister' inside the loop:

parfor i = 1:10
    rng(randseedoffset+i, 'twister');
    t{i} = random('unid', 1000);
end
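A quick way to check this fix is to run both loop variants and compare the results. The sketch below assumes the Parallel Computing Toolbox and uses a fixed, hypothetical randseedoffset; it also swaps random('unid', 1000) for randi(1000) so that it does not need the Statistics toolbox.

```matlab
randseedoffset = 12345;   % hypothetical fixed offset, for illustration only
n = 10;

% Serial reference run
tFor = zeros(n, 1);
for i = 1:n
    rng(randseedoffset + i, 'twister');
    tFor(i) = randi(1000);
end

% Parallel run with the generator type set explicitly on the worker
tPar = zeros(n, 1);
parfor i = 1:n
    rng(randseedoffset + i, 'twister');
    tPar(i) = randi(1000);
end

% Compare the two runs
disp(isequal(tFor, tPar));
```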


Answer 2:

Try this:

p = gcp; % Get or open a pool

numWork = p.NumWorkers; % Get the number of workers

% mydata.seed holds the seed value and must be defined before this point
stream = RandStream('mrg32k3a','seed',mydata.seed);
RandStream.setGlobalStream(stream);

% s = RandStream.create('mrg32k3a','NumStreams',numWork,'CellOutput',true,'Seed',mydata.seed); % create numWork independent streams

n = 200; % number of values to generate on each worker
spmd
    RandStream.setGlobalStream(stream);
    x = rand(1,n);
end


Answer 3:

I feel the need to elaborate on this. Do not reset the seed inside a parfor loop, and do not use the Mersenne Twister algorithm in parallel: the numbers generated on different workers will have poor statistical independence.

You get different results because the workers use a different algorithm, chosen for the statistical properties that random numbers generated in parallel should maintain. In a parallel pool, MATLAB sets the algorithm to 'combRecursive' and selects a different Substream on each worker, so by default the random numbers on the workers are already suitable for parallel use. Furthermore, a parfor loop does not guarantee:

  • the order in which the iterations run,
  • which worker executes each iteration, or
  • how many iterations each worker performs.

Therefore, random numbers generated in a parfor loop will generally not be the same from run to run, even if each worker starts from the same state. Instead, create one independent RandStream per worker with the combRecursive algorithm ('mrg32k3a'), then set the global stream and generate the numbers on each worker inside an spmd block:

p = gcp; % Get or open a pool

numWork = p.NumWorkers; % Get the number of workers

s = RandStream.create('mrg32k3a','NumStreams',numWork,...
    'CellOutput',true); % create numWork independent streams

n = 200; % number of values to generate on each worker
spmd
    RandStream.setGlobalStream(s{labindex});
    x = rand(1,n);
end

% I generate row vectors because x is a Composite: the syntax x{:}
% returns a comma-separated list, which can then be concatenated
% into a single vector:
randVals2 = [x{:}]';
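If the goal is for parfor to match a serial for loop, as in the original question, one possible approach is to tie the random numbers to the iteration index rather than to the worker. The following is a minimal sketch, not part of the answer above: it assumes the Parallel Computing Toolbox, a hypothetical seed of 0, and uses parallel.pool.Constant to give every worker a copy of the same 'mrg32k3a' stream. Each iteration then selects its own Substream, so the result does not depend on which worker runs which iteration.

```matlab
n = 10;
seed = 0;   % hypothetical seed, for illustration only

% Serial reference: one stream, one substream per iteration
refStream = RandStream('mrg32k3a', 'Seed', seed);
rFor = zeros(n, 1);
for i = 1:n
    refStream.Substream = i;           % jump to the start of substream i
    rFor(i) = randi(refStream, 1000);
end

% Parallel version: every worker holds a copy of the same stream
sc = parallel.pool.Constant(RandStream('mrg32k3a', 'Seed', seed));
rPar = zeros(n, 1);
parfor i = 1:n
    stream = sc.Value;                 % worker-local copy of the stream
    stream.Substream = i;              % substream depends only on i
    rPar(i) = randi(stream, 1000);
end

% Compare the serial and parallel results
disp(isequal(rFor, rPar));
```

Because the numbers are drawn from an explicit stream object rather than the global stream, the worker's default generator is left untouched.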