I'm trying to compile a CUDA project, but the build fails with exit code 255 as soon as I call a function defined in a separate .cu file.
This is where the main kernel is defined:
#include <curand_kernel.h>
#include <ctime>
#include <stdio.h>
#include "Scene.cuh"
__global__ void fill(float *c, Scene* scene)
{
    int index = blockIdx.y * blockDim.x * blockDim.y * gridDim.x +
                threadIdx.y * blockDim.x * gridDim.x +
                blockIdx.x * blockDim.x + threadIdx.x;
    // this is the line which gives the compilation error
    float3 result = scene->computeRayFromIndex(index);
    c[index * 4 + 0] += 1.0f;
    c[index * 4 + 1] += 1.0f;
    c[index * 4 + 2] += 1.0f;
    c[index * 4 + 3] += 1.0f;
}
Here is Scene.cuh:
#ifndef Scene_h
#define Scene_h
#include "cuda_runtime.h"
class Scene {
public:
    Scene();
    __host__ __device__ float3 computeRayFromIndex(int);
    int width;
    int height;
    int cameraType;
private:
};
#endif
And Scene.cu:
#include "Scene.cuh"
Scene::Scene() {
}
__host__ __device__ float3 Scene::computeRayFromIndex(int pixelIndex) {
    float3 test;
    return test;
}
I'm using Visual Studio 2013 and adding the CUDA files to my project as usual from the menu.
This is the compilation error:
Error 10 error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin\nvcc.exe" -gencode=arch=compute_20,code=\"sm_20,compute_20\" --use-local-env --cl-version 2013 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\x86_amd64" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\include" -G --keep-dir x64\Debug -maxrregcount=0 --machine 64 --compile -cudart static -g -DWIN32 -DWIN64 -D_DEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /FS /Zi /RTC1 /MDd " -o x64\Debug\fillRandomTexture.cu.obj "D:\CUDA\projects\vRay\vRay\fillRandomTexture.cu"" exited with code 255.
The project builds and runs fine if I comment out
float3 result = scene->computeRayFromIndex(index);
in the main kernel file.
In CUDA, when we want to call a device code function from another device code function, and those two device code functions are in separate compilation units, it's necessary to enable relocatable device code generation and linking when compiling such a project.
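For a command-line view of the same requirement, here is a minimal sketch. The file names (ray.cu, main.cu), the function makeRay, and the nvcc command lines in the leading comment are illustrative assumptions, not taken from the project in the question. The listing is split into two hypothetical files at the marked comments. Built as two separate compilation units, it needs relocatable device code (-rdc=true) for the kernel's call to makeRay to be resolved at device link time.

```cuda
// Build sketch (assumed command lines; adjust file names, object suffixes and
// the target architecture to your setup):
//   nvcc -rdc=true -c ray.cu  -o ray.obj
//   nvcc -rdc=true -c main.cu -o main.obj
//   nvcc ray.obj main.obj -o app
// Without -rdc=true, compiling main.cu fails because makeRay has no device
// definition in that compilation unit.

// ---- ray.cu (hypothetical file): defines the device function ----
#include <cuda_runtime.h>

__host__ __device__ float3 makeRay(int pixelIndex)
{
    // Placeholder values; a real implementation would compute a ray direction.
    return make_float3(static_cast<float>(pixelIndex), 0.0f, 1.0f);
}

// ---- main.cu (hypothetical file): calls it from a kernel ----
#include <cstdio>
#include <cuda_runtime.h>

// Declaration only; the definition lives in ray.cu and is resolved by the
// device linker.
__host__ __device__ float3 makeRay(int pixelIndex);

__global__ void fill(float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = makeRay(i).x;  // cross-file device call
    }
}

int main()
{
    const int n = 8;
    float *d_out = 0;
    cudaMalloc(&d_out, n * sizeof(float));

    fill<<<1, n>>>(d_out, n);

    float h_out[n];
    // cudaMemcpy waits for the kernel to finish before copying.
    cudaError_t err = cudaMemcpy(h_out, d_out, sizeof(h_out),
                                 cudaMemcpyDeviceToHost);
    printf("status: %s, last value: %f\n", cudaGetErrorString(err), h_out[n - 1]);

    cudaFree(d_out);
    return err == cudaSuccess ? 0 : 1;
}
```

The -dc flag is shorthand for -rdc=true -c, so the two compile steps could use it instead.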
In Visual Studio, relocatable device code can be enabled for the entire project from the project properties page, as indicated here:
Also, when working with Visual Studio and CUDA, the error "MSB3721" is a non-specific error from Visual Studio indicating "I ran nvcc and it returned an error". However, the actual error from nvcc should appear before it. If you don't see it in the compile output window immediately prior to the "MSB3721" error, then your verbosity level is too low. You can increase it; the exact method varies slightly by VS version, so I recommend searching for how to do that for your specific version.