If I use thrust::transform with thrust::host, the lambda usage is fine:

thrust::transform(thrust::host, a, a + arraySize, b, d, [](int a, int b) -> int
{
    return a + b;
});
However, if I change thrust::host to thrust::device, the code does not compile. Here is the error on VS2013:

The closure type for a lambda ("lambda [](int, int)->int") cannot be used in the template argument type of a __global__ function template instantiation, unless the lambda is defined within a __device__ or __global__ function

So, the problem is how to use __device__ or __global__ in connection with device lambdas.
Simple code using device lambdas works under CUDA 8.0 RC, although device lambdas in this version of CUDA are still at an experimental stage. Remember to use --expt-extended-lambda for compilation.
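The answer's original code listing was lost; the following is a minimal sketch of how such a program could look, assuming CUDA 8.0 with the experimental extended-lambda feature enabled (variable names and sizes are illustrative, not the answer's original):

```cuda
#include <thrust/device_vector.h>
#include <thrust/transform.h>
#include <thrust/execution_policy.h>
#include <cstdio>

int main()
{
    const int arraySize = 4;
    thrust::device_vector<int> a(arraySize, 1);
    thrust::device_vector<int> b(arraySize, 2);
    thrust::device_vector<int> d(arraySize);

    // The __device__ marker on the lambda is what requires
    // the --expt-extended-lambda nvcc flag.
    thrust::transform(thrust::device, a.begin(), a.end(), b.begin(), d.begin(),
                      [] __device__ (int x, int y) -> int { return x + y; });

    // Copy back implicitly via operator[] and print the results.
    for (int i = 0; i < arraySize; i++)
        printf("%d\n", (int)d[i]);

    return 0;
}
```

With thrust::host, a plain host lambda suffices; with thrust::device, the lambda must be marked __device__ so it can be instantiated inside a __global__ function.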
In CUDA 7 it is not possible. Quoting Mark Harris:

With CUDA 7, Thrust algorithms can be called from device code (e.g., CUDA kernels or __device__ functors). In those situations, you can use (device) lambdas with Thrust. An example is given in the Parallel Forall blog post here.

However, CUDA 7.5 introduces an experimental device lambda feature. This feature is described here:

In order to enable compilation for this feature (currently, with CUDA 7.5), it's necessary to specify --expt-extended-lambda on the nvcc compile command line.
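Concretely, the compile invocation would look something like the following (the file name test.cu is a hypothetical placeholder):

```shell
# --expt-extended-lambda enables the experimental __device__ lambda
# feature (required on CUDA 7.5 and 8.0 RC)
nvcc --expt-extended-lambda -o test test.cu
```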