I'm computing a function f(x) = exp(-x) in Matlab, where x is a vector of scalars. The function is computed on GPU, e.g.
x_cpu = [4 5 11 1];
x = gpuArray(x_cpu);
f = exp(-x);
then the result would be:
f = exp(-[4, 5, 11, 1]) = [0.0183, 0.0067, 1.6702e-005, 0.3679].
Note that f(3) = exp(-x(3)) = exp(-11) = 1.6702e-005 = 0.000016702, which is a pretty small value. So I would like to avoid computing the function for all x(i) > 10 and simply set the corresponding f values to 0.
I can probably use the sparse matrix representation for x. However, the Parallel Computing Toolbox does not support operations on sparse matrices on GPU.
How would you approach this?
While the Parallel Computing Toolbox does not support sparse matrix operations on the GPU, Jacket does. So one possible approach is simply to use a different tool.
Disclaimer: I work on Jacket. That said, I genuinely think it would be beneficial to you here, since it supports the operations you want to perform that PCT does not.
PLEASE NOTE: This approach is a workaround meant to address the statement in the question:
So, I would like to avoid computing the function for all x(i) > 10 by
simply setting f(x(i)) = 0.
In no way is this a truly "sparse" numerical method. This is simply a means to "avoid computing the function for all x(i) > 10" on the GPU in MATLAB
% original input vector
x_cpu = [4 5 10 1 13 8 9];
% logical indices of x where exp(-x) is significant
ix = x_cpu <= 10;
% values of x where exp(-x) is significant ("sparse" x)
x_sp = x_cpu(ix);
% Load our "sparse" vector to GPU
x_gpu = gpuArray(x_sp);
% create a vector of zeros for function output on GPU
f_gpu = parallel.gpu.GPUArray.zeros(size(x_cpu));
% do the calculations only for the "sparse" matrix on the GPU
f_gpu(ix) = exp(-x_gpu);
For when you want to get your computations back in the workspace, use gather:
f_cpu = gather(f_gpu); % GPU --> workspace
NOTE: I have not tested this code
You should combine some of these initializations (x_sp or ix, maybe) to conserve memory and speed up the process. Honestly, the initializations and the transfer of data between the workspace and the GPU might actually make this whole process slower than before. Nothing left to do but try it!
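For instance, a condensed version that skips the x_sp and x_gpu temporaries by indexing and transferring in one step might look like this (untested, same caveat as above):

    % original input vector
    x_cpu = [4 5 10 1 13 8 9];
    % one logical mask, no separate "sparse" copies kept around
    ix = x_cpu <= 10;
    % zeros on the GPU for the full-size output
    f_gpu = parallel.gpu.GPUArray.zeros(size(x_cpu));
    % transfer only the significant entries and compute in place
    f_gpu(ix) = exp(-gpuArray(x_cpu(ix)));
    % GPU --> workspace
    f_cpu = gather(f_gpu);

Entries where x_cpu > 10 (here x_cpu(5) = 13) simply remain 0 in f_cpu, which is exactly the behavior the question asks for.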