Generating Relocatable Device Code using Nvidia Ns

Posted 2020-03-31 02:32

Question:

I'm trying to compile a dynamic parallelism example in CUDA, and the build fails with this error:

kernel launch from __device__ or __global__ functions requires separate compilation modes

I later found that I have to set the --relocatable-device-code flag to true. Is there a setting in Nsight Eclipse that turns relocatable device code on?

Answer 1:

If you are not using a makefile project, you can change the options that an Nsight project passes to nvcc at the following location, starting from the menu:

Project - Properties - Build - Settings - Tool Settings - NVCC Compiler

As Nsight does not provide an rdc option that you can check there, you can directly change 'Command' from

nvcc

to

nvcc -rdc=true

or change 'Command line pattern' from

${COMMAND} ${FLAGS} ${OUTPUT_FLAG} ${OUTPUT_PREFIX} ${OUTPUT} ${INPUTS}

to

${COMMAND} ${FLAGS} -rdc=true ${OUTPUT_FLAG} ${OUTPUT_PREFIX} ${OUTPUT} ${INPUTS}

The second one is better.

You may also want to apply this change to 'All configurations' rather than to 'Debug' or 'Release' only.

EDIT

You should follow @RobertCrovella's instructions in the comment. It is the official way.



Answer 2:

After a project has been created, you can also make this change under Project - Properties - Build - Settings. There you will see a page similar to the "Basic settings" dialog page mentioned above. On this page, change "Device linker mode:" from "Whole program compilation" to "Separate compilation" to turn on generation of relocatable device code.

Credit goes to @robertcrovella. This was actually the answer I was looking for, so I've made it a separate answer.



Answer 3:

You can use the nvcc option "-dc" or "-rdc=true". You can refer to this as a sample:

nvlink, relocatable device code and static device libraries
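
As a rough illustration of what these flags do, here is a minimal sketch of a dynamic parallelism program with the nvcc command lines in the leading comment. The file name dynpar.cu, the kernel names parent and child, and the -arch=sm_60 value are assumptions made for this example and do not come from the question; replace the architecture with the one for your GPU (dynamic parallelism needs compute capability 3.5 or higher).

```cuda
// dynpar.cu: minimal dynamic parallelism sketch (hypothetical file and kernel names).
//
// Build in one step. -rdc=true turns on relocatable device code, and
// -lcudadevrt links the CUDA device runtime that device-side launches need:
//   nvcc -arch=sm_60 -rdc=true dynpar.cu -o dynpar -lcudadevrt
//
// Or compile and link in two steps. -dc is shorthand for -rdc=true -c:
//   nvcc -arch=sm_60 -dc dynpar.cu -o dynpar.o
//   nvcc -arch=sm_60 dynpar.o -o dynpar -lcudadevrt

#include <cstdio>
#include <cuda_runtime.h>

// Child kernel, launched from device code.
__global__ void child(int parent_block)
{
    printf("child thread %d launched by parent block %d\n",
           threadIdx.x, parent_block);
}

// Parent kernel. The launch below is a kernel launch from a __global__
// function, which is what triggers the "requires separate compilation
// mode" error when relocatable device code is off.
__global__ void parent()
{
    if (threadIdx.x == 0) {
        child<<<1, 4>>>(blockIdx.x);
    }
}

int main()
{
    parent<<<2, 32>>>();

    // Wait for the parent grid (and its child grids) and report any error.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

Compiling this file without -rdc=true (or -dc) should reproduce the error from the question. Setting "Device linker mode:" to "Separate compilation" in Nsight is the IDE route to the same flag.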