If I use thrust::transform with thrust::host, the lambda usage is fine:

thrust::transform(thrust::host, a, a + arraySize, b, d, [](int a, int b) -> int
{
    return a + b;
});
However, if I change thrust::host to thrust::device, the code does not compile. Here is the error on VS2013:

The closure type for a lambda ("lambda [](int, int)->int") cannot be used in the template argument type of a __global__ function template instantiation, unless the lambda is defined within a __device__ or __global__ function

So, the problem is how to use __device__ or __global__ in connection with device lambdas.
Simple code using device lambdas works under CUDA 8.0 RC, although device lambdas in this version of CUDA are still at an experimental stage. Remember to use --expt-extended-lambda for compilation.
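The answer's original code listing was lost; the following is a minimal sketch of how such a program could look, assuming CUDA 8.0 with the experimental extended-lambda feature enabled (variable names and sizes are illustrative, not the answer's original):

```cuda
#include <thrust/device_vector.h>
#include <thrust/transform.h>
#include <thrust/execution_policy.h>
#include <cstdio>

int main()
{
    const int arraySize = 4;
    thrust::device_vector<int> a(arraySize, 1);
    thrust::device_vector<int> b(arraySize, 2);
    thrust::device_vector<int> d(arraySize);

    // The __device__ marker on the lambda is what requires
    // the --expt-extended-lambda nvcc flag.
    thrust::transform(thrust::device, a.begin(), a.end(), b.begin(), d.begin(),
                      [] __device__ (int x, int y) -> int { return x + y; });

    // Copy back implicitly via operator[] and print the results.
    for (int i = 0; i < arraySize; i++)
        printf("%d\n", (int)d[i]);

    return 0;
}
```

With thrust::host, a plain host lambda suffices; with thrust::device, the lambda must be marked __device__ so it can be instantiated inside a __global__ function.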
In CUDA 7 it is not possible. Quoting Mark Harris:

With CUDA 7, Thrust algorithms can be called from device code (e.g., CUDA kernels or __device__ functors). In those situations, you can use (device) lambdas with Thrust. An example is given in the Parallel Forall blog post here.

However, CUDA 7.5 introduces an experimental device lambda feature. This feature is described here:

In order to enable compilation for this feature (currently, with CUDA 7.5), it's necessary to specify --expt-extended-lambda on the nvcc compile command line.
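Concretely, the compile invocation would look something like the following (the file name test.cu is a hypothetical placeholder):

```shell
# --expt-extended-lambda enables the experimental __device__ lambda
# feature (required on CUDA 7.5 and 8.0 RC)
nvcc --expt-extended-lambda -o test test.cu
```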