Why defining class headers without CUDA __device__

Posted 2019-09-08 07:19

Question:

I have a .h file with the following declarations:

class Foo{
public:
    inline int getInt();
};

and my .cu file defines the following:

__device__ int Foo::getInt(){
   return 42;
}

This is pretty awesome, because although I cannot actually call getInt from the host, I can include the .h file in .cpp files, so the type declaration is visible to the host. But it doesn't seem like this should work, so why don't I need to put the __device__ attribute in the .h file?

Answer 1:

If it works, it should not: the declaration and the definition of a function must agree on their execution-space qualifiers. This is probably a bug in the CUDA compiler and it might get fixed in the future, so do not rely on it.

However, if you want the class to be visible to the host (and to a non-CUDA compiler), but it has some __device__ functionality which you don't need on the host, you can always guard those functions with #ifdef __CUDACC__ … #endif. The macro __CUDACC__ is predefined when compiling with nvcc and is undefined otherwise. So you can write something like this in your header:

class Foo{
public:
#ifdef __CUDACC__
    inline __device__ int getInt();
#endif
};

If you are afraid of having too many preprocessor ifdefs, you can also do a trick as follows:

#ifdef __CUDACC__
#define HOST __host__
#define DEVICE __device__
#else
#define HOST
#define DEVICE
#endif

...

class Foo{
public:
    inline HOST DEVICE int getInt();
};


Answer 2:

Change it to the following:

__device__ int Foo::getInt(){
   return 42;
}

The problem is the return type of the function: it isn't void, it is int.