I'm trying to compile a dynamic parallelism example in CUDA, and when I compile it I get an error saying:
kernel launch from __device__ or __global__ functions requires separate compilation modes
I later found that I have to set the --relocatable-device-code flag to true. But is there a setting in Nsight Eclipse that sets relocatable-device-code to true?
If you are not using a makefile project, you can change the options passed to nvcc for an Nsight project at the following location, starting from the menu:
Project - Properties - Build - Settings - Tool Settings - NVCC Compiler
As Nsight does not provide an rdc option for you to check, you could directly change 'Command' from
nvcc
to
nvcc -rdc=true
or change 'Command line pattern' from
${COMMAND} ${FLAGS} ${OUTPUT_FLAG} ${OUTPUT_PREFIX} ${OUTPUT} ${INPUTS}
to
${COMMAND} ${FLAGS} -rdc=true ${OUTPUT_FLAG} ${OUTPUT_PREFIX} ${OUTPUT} ${INPUTS}
The second one is better.
You may also want to change this for 'All configurations' rather than 'Debug' or 'Release' only.
EDIT
You should follow @RobertCrovella's instructions in the comment. It is the official way.
After a project is created, you can also make this change by going to Project...Properties...Build...Settings. Here you will see a page similar to the one mentioned above in the "Basic settings" dialog page. You can similarly change "Device linker mode:" on this page from "Whole program compilation" to "Separate compilation" in order to turn on generation of relocatable device code, after the project has already been created.
Credit goes to @robertcrovella. This was actually the answer I was looking for, so I've made it a separate answer.
You can use the nvcc option "-dc" or "-rdc=true" ("-dc" is equivalent to "-rdc=true" combined with "-c"). You can refer to this as a sample:
nvlink, relocatable device code and static device libraries
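For reference, here is a minimal sketch of the kind of code that triggers the error, with command-line builds for both flags. The file name dp.cu and the kernel names are hypothetical, and the -arch value is an assumption: use one that matches your GPU (dynamic parallelism needs compute capability 3.5 or higher). Linking against cudadevrt is required when building by hand.

```cuda
// dp.cu: minimal dynamic parallelism sketch (file and kernel names are hypothetical).
//
// Assumed one-step build (adjust -arch to your GPU, compute capability >= 3.5):
//   nvcc -arch=sm_50 -rdc=true dp.cu -o dp -lcudadevrt
//
// Assumed two-step build using -dc (compile to relocatable device code, then link):
//   nvcc -arch=sm_50 -dc dp.cu -o dp.o
//   nvcc -arch=sm_50 dp.o -o dp -lcudadevrt
#include <cstdio>
#include <cuda_runtime.h>

// Child kernel, launched from the device.
__global__ void child(int parent_block)
{
    printf("child thread %d launched by parent block %d\n", threadIdx.x, parent_block);
}

// Parent kernel. The launch below is a kernel launch from a __global__ function,
// which is what nvcc rejects unless relocatable device code is enabled.
__global__ void parent()
{
    if (threadIdx.x == 0) {
        child<<<1, 4>>>(blockIdx.x);
    }
}

int main()
{
    parent<<<2, 32>>>();

    // Waiting on the host for the parent grid also waits for its child grids.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

Without -rdc=true (or -dc), the launch inside parent() should produce the error quoted in the question.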