I have a MATLAB program that reads a large amount of data from a file on disk and then performs an intensive computation on it, like this:
data = load('myfile.dat');
results = intensiveCompute(data);
The computation is done on the GPU and takes a very long time. What I'd like to do is load the data from the next file while the computation is running (since loading files is also a bottleneck). From what I gather so far, this is doable using MEX (e.g. `_beginthread`, etc.). However, if possible it would be ideal to stay within the MATLAB environment. Perhaps there's some way to spawn one thread in MATLAB to read data and another to perform the computation. Any help is greatly appreciated.
I know you mentioned you want to stay within Matlab, and as chappjc suggests you can use the Parallel Computing Toolbox, but most of us don't have lots of toolboxes.
Is your data only in the MAT-file format, or is it available in some other format like CSV or HDF5? If you know Java, or have access to someone who can program in it, I would suggest using Java threads, since MATLAB runs on Java and marshals data between Java and MATLAB efficiently. Then you don't have to worry about MEX files.
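To make the Java-thread idea concrete, here is a minimal sketch of how you might drive it from MATLAB. `AsyncLoader` and `loadAsync` are hypothetical names: you would write and compile a small Java helper class yourself, exposing a method that returns a `java.util.concurrent.Future` whose background thread reads the file's bytes.

```matlab
% Hypothetical sketch: AsyncLoader is a small Java class you compile yourself,
% exposing loadAsync(String) -> java.util.concurrent.Future, which reads the
% file's bytes on a background Java thread.
javaaddpath('asyncloader.jar');            % make the helper class visible to MATLAB
loader = AsyncLoader();
fut = loader.loadAsync('nextfile.dat');    % returns immediately; Java thread reads

% ... GPU computation on the current data runs here ...

raw = typecast(fut.get(), 'uint8');        % Java byte[] marshals to int8; recast
```

Note that `fut.get()` blocks only when you actually need the data, so the read overlaps with the computation.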
In this answer I detailed an approach using the `task` and `job` functions for asynchronous execution, but I think for a simple `load`, `parfeval` might be easiest.

Note: Be sure to allow incoming connections in the Windows firewall for MATLAB.exe, smpd.exe and mpiexec.exe. You should be prompted the first time a pool is launched (automatically by `parfeval`).

Here's a simple example to show how it works:
At this point, we see that the command is still running on the worker. Obviously, we could be doing something more useful than simply checking on the job... but here's what happens after a brief wait:
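A sketch of that check and the eventual result, using the `parfeval` future (here `f`):

```matlab
disp(f.State)            % eventually shows 'finished'
S = fetchOutputs(f);     % struct of the variables load read from the file
% now hand the data to the GPU computation, e.g. results = intensiveCompute(S.data)
```

`fetchOutputs` blocks until the future finishes, so you call it only at the point where the next file's data is actually needed.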