The build time of my CUDA library is increasing, so I thought that separate compilation, introduced in CUDA 5.0, might help. I couldn't figure out how to achieve separate compilation with CMake. I looked into the NVCC documentation and found how to compile device objects (using the -dc option) and how to link them (using -dlink). My attempts to get this running with CMake failed. I'm using CMake 2.8.10.2 and the head of the trunk of FindCUDA.cmake. However, I couldn't find out how to specify which files should be compiled and how to link them into a library.
In particular, the syntax of the function CUDA_LINK_SEPARABLE_COMPILATION_OBJECTS(output_file_var cuda_target options object_files source_files) is unclear to me, because I don't know what output_file_var and cuda_target are.
Here are the non-working results of my attempts:
cuda_compile(DEVICEMANAGER_O devicemanager.cu OPTIONS -dc)
cuda_compile(BLUB_O blub.cu OPTIONS -dc)
CUDA_LINK_SEPARABLE_COMPILATION_OBJECTS(TEST_O gpuacceleration
"" DEVICEMANGER_O BLUB_O)
set(LIB_TYPE SHARED)
#cuda_add_library(gpuacceleration ${LIB_TYPE}
#${gpuacc_SRCS}
#devicemanager.cu
# blub.cu
#DEVICEMANAGER_O
# TEST_O
#)
Does anyone know how to compile and link a CUDA library using CMake? Thanks in advance.
EDIT: After a friend consulted the developer of FindCUDA.cmake, a bug was fixed in the example provided with FindCUDA.cmake (https://gforge.sci.utah.edu/gf/project/findcuda/scmsvn/?action=browse&path=%2Fcheckout%2Ftrunk%2FFindCuda.html). I'm now able to build the example. In my project I can build the library as needed using the following (CMake 2.8.10 required):
set(LIB_TYPE SHARED)
set(CUDA_SEPARABLE_COMPILATION ON)
cuda_add_library(gpuacceleration ${LIB_TYPE}
blub.cu
blab.cu
)
BUT: I cannot link against this library. When I built the lib without separate compilation, I was able to link against it. Now I get the following error:
undefined reference to `__cudaRegisterLinkedBinary_53_tmpxft_00005ab4_00000000_6_blub_cpp1_ii_d07d5695'
for every file with a function used in the interface. This seems strange, since the library builds without any warnings. Any ideas how to get this working?
EDIT: I finally figured out how to do this. See @PHD's and my answer for details.
Tested it with nvcc version:
and svn revision:
This example includes the following classes:
kernel.cu contains a simple CUDA kernel and a class with a public method to call the CUDA kernel. The class lib contains an instance of the class kernel and a method calling the public method of class kernel.
The following CMakeLists.txt works with this configuration:
I finally got it running ;)
In addition to the answer of @PHD and my comment on it, I modified:
set(BUILD_SHARED_LIBS OFF)
in my CMakeLists.txt, since shared libs are not supported for separate compilation according to the nvcc manual v5.0, page 40.
In addition to that, use the latest rev (1223) from the repository instead of rev 1221. I contacted the developer and he fixed an issue blocking this. This revision doesn't set the nvcc -arch=sm_xx flag correctly, so I added it manually for my project and informed the developer of FindCUDA.cmake, so this might get fixed in the future.
Don't forget to get cmake > 2.8.10 for this to work.
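The settings described above can be put together roughly as follows. This is only a minimal sketch, not the listing from my project: the project name is made up, the source file names are taken from the question, and sm_20 is just a placeholder for whatever architecture your GPU needs.

```cmake
# Minimal sketch, assuming a FindCUDA.cmake that supports separable compilation.
cmake_minimum_required(VERSION 2.8.10)
project(gpuacceleration_sketch)   # hypothetical project name

find_package(CUDA REQUIRED)

# Shared libs are reported as unsupported for separate compilation
# (nvcc manual v5.0), so build static libraries.
set(BUILD_SHARED_LIBS OFF)

# Ask FindCUDA to compile the .cu files as relocatable device code.
set(CUDA_SEPARABLE_COMPILATION ON)

# Assumption: sm_20 is a placeholder; set the architecture manually
# because the FindCUDA revision discussed here did not pass it correctly.
list(APPEND CUDA_NVCC_FLAGS -arch=sm_20)

# Source file names as used in the question.
cuda_add_library(gpuacceleration STATIC
  blub.cu
  blab.cu
)
```

Passing STATIC explicitly is redundant once BUILD_SHARED_LIBS is OFF, but it makes the intent visible at the call site.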
Hope this helps someone besides me ;)
Here is my CMakeLists.txt:
EDIT: this is not working! The problem is that there are undefined references to all CUDA functions (e.g. cudaMalloc) when the generated library is linked into an executable in the main project.
Still working on it
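One thing that may be worth checking in this situation is whether the executable in the main project links the CUDA runtime itself, since cudaMalloc and friends live there. The following is only a sketch under that assumption and not a confirmed fix for this case; the target name app and the file main.cpp are hypothetical, and gpuacceleration is assumed to be visible to the main project as a target or library name.

```cmake
# Sketch for the consuming (main) project; names are illustrative.
find_package(CUDA REQUIRED)

# Make the CUDA headers available to host code.
include_directories(${CUDA_INCLUDE_DIRS})

# Hypothetical executable built from a plain C++ source file.
add_executable(app main.cpp)

# Link the library first, then the CUDA runtime libraries found by FindCUDA,
# so that references to cudaMalloc etc. from the library can be resolved.
target_link_libraries(app gpuacceleration ${CUDA_LIBRARIES})
```

The order matters for static libraries with most Unix linkers: the library that uses the runtime symbols should come before the libraries that provide them.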
I couldn't make it work using CUDA_ADD_EXECUTABLE, so I created a function that makes a custom target to do so.
Now, to generate a lib, just use:
Or this to generate an executable:
The last param is a list of libs to be attached.
I hope it helps.
EDIT (2016-03-15): Yes, it is confirmed as a bug in FindCUDA: https://cmake.org/Bug/view.php?id=15157
TL;DR: This seems to be a bug in FindCUDA, which makes objects lose info on external definitions before the final linking.
The problem is that, even if separable compilation is enabled, a linking step is still performed for all the targets individually before the final linking.
For instance, I have module.cu with:
and module.h with:
and finally main.cu with:
This then works fine with the following Makefile:
But then I try to use the following CMakeLists.txt:
When compiling, what then happens is the following:
Clearly, the problem is the
nvcc -dlink obj.o -o obj_intermediate_link.o
lines. Then, I guess, the info on external definitions is lost. So the question is: is it possible to make CMake/FindCUDA not do this extra linking step? Otherwise, I would argue that this is a bug. Do you agree? I can file a bug report with CMake.
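Until that is resolved, one possible way around the per-target -dlink step is to drive nvcc directly from custom commands, so that every file is compiled with -dc and a single final nvcc call does the device link and the host link together. The following is only a sketch of that idea, not something FindCUDA provides: the names SKETCH_ARCH, SKETCH_OBJECTS and main_sketch are made up, sm_20 is a placeholder architecture, and the .o suffix assumes a Unix-like toolchain.

```cmake
# Sketch: bypass cuda_add_executable and call nvcc via custom commands.
cmake_minimum_required(VERSION 2.8.10)
project(separable_sketch)          # hypothetical project name

find_package(CUDA REQUIRED)        # provides CUDA_NVCC_EXECUTABLE

# Assumption: placeholder architecture, adjust to your GPU.
set(SKETCH_ARCH -arch=sm_20)

# Compile each source to a relocatable device object with -dc.
set(SKETCH_OBJECTS)
foreach(src module.cu main.cu)
  get_filename_component(name ${src} NAME_WE)
  set(obj ${CMAKE_CURRENT_BINARY_DIR}/${name}.o)
  add_custom_command(
    OUTPUT ${obj}
    COMMAND ${CUDA_NVCC_EXECUTABLE} ${SKETCH_ARCH} -dc
            ${CMAKE_CURRENT_SOURCE_DIR}/${src} -o ${obj}
    DEPENDS ${CMAKE_CURRENT_SOURCE_DIR}/${src}
    COMMENT "nvcc -dc ${src}")
  list(APPEND SKETCH_OBJECTS ${obj})
endforeach()

# One final nvcc call links all objects; no intermediate -dlink per target.
add_custom_command(
  OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/main
  COMMAND ${CUDA_NVCC_EXECUTABLE} ${SKETCH_ARCH} ${SKETCH_OBJECTS}
          -o ${CMAKE_CURRENT_BINARY_DIR}/main
  DEPENDS ${SKETCH_OBJECTS}
  COMMENT "nvcc final link")

add_custom_target(main_sketch ALL DEPENDS ${CMAKE_CURRENT_BINARY_DIR}/main)
```

Note that this sketch does not track header dependencies, so a change to module.h alone will not trigger a rebuild, and the result is a custom target rather than a regular CMake executable target.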