I'm computing a function f(x) = exp(-x) in Matlab, where x is a vector of scalars. The function is computed on GPU, e.g.
x_cpu = [4 5 11 1];
x = gpuArray(x_cpu);
f = exp(-x);
then the result would be:
f = exp(-[4, 5, 11, 1]) = [0.0183, 0.0067, 1.6702e-005, 0.3679].
Note that f(3) = exp(-x(3)) = exp(-11) = 1.6702e-005 = 0.000016702, which is a pretty small value. So I would like to avoid computing the function for all x(i) > 10 and simply set the corresponding f values to 0.
I can probably use the sparse matrix representation for x. However, the Parallel Computing Toolbox does not support operations on sparse matrices on GPU.
How would you approach this?
While the Parallel Computing Toolbox does not support sparse matrix operations on the GPU, Jacket does. So one possible approach is simply to use a different tool.
Disclaimer: I work on Jacket. That said, I genuinely think it would be beneficial to you here, since it supports the operations you want to perform that PCT does not.
PLEASE NOTE: This approach is a workaround meant to address the statement in the question:
So, I would like to avoid computing the function for all x(i) > 10 by
simply setting f(x(i)) = 0.
In no way is this a truly "sparse" numerical method. This is simply a means to "avoid computing the function for all x(i) > 10" on the GPU in MATLAB
% original input vector
x_cpu = [4 5 10 1 13 8 9];
% logical indices of x where exp(-x) is significant
ix = x_cpu <= 10;
% values of x where exp(-x) is significant ("sparse" x)
x_sp = x_cpu(ix);
% Load our "sparse" vector to GPU
x_gpu = gpuArray(x_sp);
% create a vector of zeros for function output on GPU
f_gpu = parallel.gpu.GPUArray.zeros(size(x_cpu));
% do the calculations only for the "sparse" matrix on the GPU
f_gpu(ix) = exp(-x_gpu);
For when you want to get your computations back in the workspace, use gather:
f_cpu = gather(f_gpu); % GPU --> workspace
NOTE: I have not tested this code
You should combine some of these initializations (x_sp or ix, maybe) to conserve memory and speed up the process. Honestly, the initializations and the transfer of data between the workspace and the GPU might actually make this whole process slower than before. Nothing left to do but try it!
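For instance, a condensed version that skips the x_sp and x_gpu temporaries by indexing and transferring in one step might look like this (untested, same caveat as above):

    % original input vector
    x_cpu = [4 5 10 1 13 8 9];
    % one logical mask, no separate "sparse" copies kept around
    ix = x_cpu <= 10;
    % zeros on the GPU for the full-size output
    f_gpu = parallel.gpu.GPUArray.zeros(size(x_cpu));
    % transfer only the significant entries and compute in place
    f_gpu(ix) = exp(-gpuArray(x_cpu(ix)));
    % GPU --> workspace
    f_cpu = gather(f_gpu);

Entries where x_cpu > 10 (here x_cpu(5) = 13) simply remain 0 in f_cpu, which is exactly the behavior the question asks for.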