Is it possible to load data file in parallel to co

Posted 2019-08-06 08:09

I have a MATLAB program that reads a large amount of data from a file on disk and then performs an intensive computation, like this:

data = load('myfile.dat');
results = intensiveCompute(data);

The computation is done on the GPU and takes a very long time. What I'd like to do is load the data from the next file while the computation is running, since loading the file is also a bottleneck. From what I gather so far, this is doable using MEX (e.g. _beginthread, etc.). However, if possible, it would be ideal to stay within the MATLAB environment. Perhaps there's some way to spawn one thread in MATLAB to read data and another to perform the computation. Any help is greatly appreciated.

2 Answers
仙女界的扛把子
#2 · 2019-08-06 08:28

I know you mentioned that you want to stay within MATLAB, and as chappjc suggests you can use the Parallel Computing Toolbox, but most of us don't have lots of toolboxes.

Is your data only in the MAT-file format, or is it available in some other format like CSV or HDF5? If you know Java or have access to someone who can program in it, I would suggest using Java threads, since MATLAB ships with a JVM, can call Java classes directly, and has high-performance marshalling of data between Java and MATLAB. Then you don't have to worry about MEX files.

做个烂人
#3 · 2019-08-06 08:35

In another answer I detailed an approach using the task and job functions for asynchronous execution, but for a simple load I think parfeval might be easiest. For example,

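f = parfeval(@load,1,'myfile.dat'); % asynchronous: the next file starts loading on a worker
results = intensiveCompute(data);   % meanwhile, compute on the data that is already in memory
data = fetchOutputs(f);             % blocks until the load is complete, then returns the new data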

Note: Be sure to allow incoming connections in the Windows firewall for MATLAB.exe, smpd.exe and mpiexec.exe. You should be prompted the first time a pool is launched (which parfeval does automatically).

Here's a simple example to show how it works:

>> x = magic(5);
>> save x.mat x
>> f = parfeval(@load,1,'x.mat');
Starting parallel pool (parpool) using the 'local' profile ... connected ...
>> f
f = 
 FevalFuture with properties: 

                   ID: 1
             Function: @load
                State: running
      ErrorIdentifier: 
         ErrorMessage: 

At this point, we see that the command is still running on the worker. Obviously, we can be doing something more useful than simply checking on the job... but here's what happens after a brief wait:

>> f
f = 
 FevalFuture with properties: 

                   ID: 1
             Function: @load
                State: finished (unread)
      ErrorIdentifier: 
         ErrorMessage: 
>> % all done, load the data
>> data = fetchOutputs(f) % Blocks until complete
data = 
x: [5x5 double]
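
To tie this back to the original question, the same pattern can be put in a loop so that file k+1 loads on a worker while the client computes on file k. The following is only a minimal sketch under some assumptions: the Parallel Computing Toolbox is available, the file names are hypothetical and are created in tempdir so the example runs on its own, and intensiveCompute is a trivial placeholder standing in for the real GPU computation.

```matlab
% Hypothetical stand-in data: three small MAT-files in the temp folder.
files = {fullfile(tempdir, 'part1.mat'), ...
         fullfile(tempdir, 'part2.mat'), ...
         fullfile(tempdir, 'part3.mat')};
for k = 1:numel(files)
    x = magic(k + 2);          % 3x3, 4x4 and 5x5 magic squares
    save(files{k}, 'x');
end

% Placeholder for the real computation (assumption, not the asker's code).
% load with one output returns a struct, so the matrix is in the field x.
intensiveCompute = @(s) sum(s.x(:));

results = cell(size(files));
f = parfeval(@load, 1, files{1});             % start loading the first file
for k = 1:numel(files)
    data = fetchOutputs(f);                   % blocks until file k has been loaded
    if k < numel(files)
        f = parfeval(@load, 1, files{k+1});   % prefetch the next file on a worker
    end
    results{k} = intensiveCompute(data);      % runs on the client while the worker loads
end
```

Keep in mind that the loaded data still has to be transferred from the worker back to the client when fetchOutputs is called, so whether this actually hides the loading time depends on the file size and is worth measuring for your own data.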