Simple parallel execution in MATLAB

Posted 2020-04-01 03:47

Question:

I have figured out some awesome ways of speeding up my MATLAB code: vectorizing, arrayfun, and basically just getting rid of for loops (not using parfor). I want to take it to the next step.

Suppose I have 2 function calls that are computationally intensive.

x = fun(a);
y = fun(b);

They are completely independent, and I want to run them in parallel rather than serially. I don't have the Parallel Computing Toolbox. Any help is appreciated.

thanks

Answer 1:

Reading the question optimistically, I think you are asking "How can I simply do parallel processing in MATLAB?" In that case the answer would be:

Parallel processing can most easily be done with the Parallel Computing Toolbox. This gives you access to things like parfor.

I guess you can do:

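% x and y assigned directly inside parfor would be temporary variables,
% so collect the results in a sliced cell array instead
inputs = {a, b};
results = cell(1, 2);
parfor t = 1:2
    results{t} = fun(inputs{t});
end
x = results{1}; y = results{2};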

Of course there are other ways, but that should be the simplest.



Answer 2:

The MATLAB interpreter is single-threaded, so the only way to achieve parallelism across MATLAB functions is to run multiple instances of MATLAB. Parallel Computing Toolbox does this for you, and gives you a convenient interface in the form of PARFOR/SPMD/PARFEVAL etc. You can run multiple MATLAB instances manually, but you'll probably need to do a fair bit of work to organise the work that you want to be done.
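
One possible manual setup, without the toolbox, is sketched below. It is only a sketch under several assumptions: the matlab executable is on the system PATH, the release supports the -batch startup option (R2019a or later), and both instances can read and write the same temporary folder. The file names are made up for illustration, and the built-in sum stands in for fun, which in practice would have to be a function file visible to the second instance.

```matlab
% Hypothetical stand-ins for a and b; sum plays the role of fun
a = 1:10;
b = 11:20;

% Exchange data through MAT-files (file names are illustrative only)
inFile  = fullfile(tempdir, 'par_in.mat');
tmpFile = fullfile(tempdir, 'par_out_tmp.mat');
outFile = fullfile(tempdir, 'par_out.mat');
if isfile(outFile), delete(outFile); end
save(inFile, 'b');

% Code for the second instance: load b, compute y, save it under a
% temporary name, then rename it so the parent never sees a partial file
childCode = sprintf(['load(''%s''); y = sum(b); save(''%s'', ''y''); ', ...
                     'movefile(''%s'', ''%s'');'], ...
                    inFile, tmpFile, tmpFile, outFile);

% The trailing & asks system() to return without waiting for the child
system(sprintf('matlab -batch "%s" &', childCode));

% Meanwhile, do the other half of the work in this instance
x = sum(a);

% Poll for the result file, giving up after an arbitrary 5 minutes
tWait = tic;
while ~isfile(outFile)
    if toc(tWait) > 300
        error('Timed out waiting for the second MATLAB instance.');
    end
    pause(0.5);
end
s = load(outFile);
y = s.y;
```

Starting a second MATLAB instance takes several seconds, so this only pays off when each call runs much longer than that.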



Answer 3:

The usual examples involve parfor, which is probably the easiest way to get parallelism out of MATLAB's Parallel Computing Toolbox (PCT). The parfeval function is also quite easy to use, as demonstrated in this other post. A less frequently discussed feature of the PCT is the system of jobs and tasks, which is probably the most appropriate solution for your simple case of two completely independent function calls. Spoiler: the batch command can help to simplify creation of simple jobs (see bottom of this post).

Unfortunately, it is not as straightforward to implement; for the sake of completeness, here's an example:

% Build a cluster from the default profile
c = parcluster();

% Create an independent job object
j = createJob(c);

% Use cells to pass inputs to the tasks
taskdataA = {field1varA,...};
taskdataB = {field1varB,...};

% Create one task per input cell (two tasks here), each returning 2 outputs
nTaskOutputs = 2;
t = createTask(j, @myCoarseFunction, nTaskOutputs, {taskdataA, taskdataB});

% Start the job and wait for it to finish the tasks
submit(j); wait(j);

% Get the outputs from each task
taskoutput = get(t,'OutputArguments');

delete(j); % do not forget to remove the job or your APPDATA folder will fill up!

% Unpack the outputs: taskoutput{taskIndex}{outputIndex}
out1A = taskoutput{1}{1};
out1B = taskoutput{2}{1};

out2A = taskoutput{1}{2};
out2B = taskoutput{2}{2};

The key here is the function myCoarseFunction, which is given to createTask as the function to evaluate in the task objects it creates. This can be your fun, or a wrapper if you have complicated inputs/outputs that might require a struct container.
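
A wrapper of that kind might look like the following sketch. The struct fields fcn, input and label are hypothetical names chosen for illustration; the second output just reports timing so that the function returns the two outputs that nTaskOutputs = 2 asks for.

```matlab
function [result, info] = myCoarseFunction(params)
% Sketch of a task wrapper that takes a single struct as input.
% Assumed (hypothetical) fields of params:
%   fcn   - function handle doing the real work
%   input - the argument passed to fcn
%   label - a name used to identify the task afterwards

    tStart = tic;
    result = params.fcn(params.input);

    % Second output: bookkeeping about the run
    info = struct('label', params.label, 'seconds', toc(tStart));
end
```

With a wrapper like this, each task input would be a one-element cell holding the struct, for example taskdataA = {paramsA}, where paramsA is a struct with the fields above.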

Note that for a single task, the entire workflow above of creating a job and task, then starting them with submit can be simplified with batch as follows:

c = parcluster();
jobA = batch(c, @myCoarseFunction, 1, taskdataA,...
    'Pool', c.NumWorkers / 2 - 1, 'CaptureDiary', true);

Also, keep in mind that, as with matlabpool (now called parpool), using parcluster requires time to start up the MATLAB.exe processes that will run your job.
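
For comparison, the parfeval route mentioned at the top of this answer could look roughly like this for the two calls in the question. It is a minimal sketch that assumes a parallel pool is available (or can be started automatically); the anonymous function and the values of a and b are placeholders for the asker's fun, a and b.

```matlab
% Hypothetical stand-ins for fun, a and b from the question
fun = @(v) sum(v.^2);
a = 1:3;
b = 4:6;

% Queue both calls on the current parallel pool; each returns 1 output.
% parfeval returns immediately with a future object.
fx = parfeval(fun, 1, a);
fy = parfeval(fun, 1, b);

% fetchOutputs blocks until the corresponding future has finished
x = fetchOutputs(fx);
y = fetchOutputs(fy);
```

Because both futures are queued before either result is fetched, the two calls can run on different workers at the same time.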