I have a .h file with the following declarations:
class Foo{
public:
inline int getInt();
};
and my .cu file defines the following:
__device__ int Foo::getInt(){
return 42;
}
This is pretty awesome, because although I cannot actually call getInt from the host, I can include the .h file in .cpp files, so the type declaration is visible to the host. But it doesn't seem like this should work, so why don't I need to put the __device__ attribute on the declaration in the .h file?
If it works, it should not. It is probably a bug in the CUDA compiler and it may get fixed in the future, so do not rely on it.
However, if you want the class to be visible to the host (and to a non-CUDA compiler) but it has some __device__ functionality you don't need on the host, you can always wrap those member functions in #ifdef __CUDACC__ ... #endif. __CUDACC__ is predefined when compiling with nvcc and undefined otherwise, so you can write something like this in your header:
class Foo{
public:
#ifdef __CUDACC__
inline __device__ int getInt();
#endif
};
If you are afraid of having too many preprocessor ifdefs, you can also use a trick like the following:
#ifdef __CUDACC__
#define HOST __host__
#define DEVICE __device__
#else
#define HOST
#define DEVICE
#endif
...
class Foo{
public:
inline HOST DEVICE int getInt();
};
Change it to the following:
__device__ int Foo::getInt(){
return 42;
}
The problem is the return type of the function. It isn't void, it is int.