I'm trying to run mainSift.cpp from CudaSift on an Nvidia Tesla M2090. First of all, as explained in this question, I had to change sm_35 to sm_20 in CMakeLists.txt.
Unfortunately, the program now fails with this error:
checkMsg() CUDA error: LaplaceMulti() execution failed
in file </ghome/rzhengac/Downloads/CudaSift/cudaSiftH.cu>, line 318 : unknown error.
And this is the LaplaceMulti code:
    double LaplaceMulti(cudaTextureObject_t texObj, CudaImage *results, float baseBlur, float diffScale, float initBlur)
    {
      float kernel[12*16];
      float scale = baseBlur;
      for (int i=0;i<NUM_SCALES+3;i++) {
        float kernelSum = 0.0f;
        float var = scale*scale - initBlur*initBlur;
        for (int j=-LAPLACE_R;j<=LAPLACE_R;j++) {
          kernel[16*i+j+LAPLACE_R] = (float)expf(-(double)j*j/2.0/var);
          kernelSum += kernel[16*i+j+LAPLACE_R];
        }
        for (int j=-LAPLACE_R;j<=LAPLACE_R;j++)
          kernel[16*i+j+LAPLACE_R] /= kernelSum;
        scale *= diffScale;
      }
      safeCall(cudaMemcpyToSymbol(d_Kernel2, kernel, 12*16*sizeof(float)));
      int width = results[0].width;
      int pitch = results[0].pitch;
      int height = results[0].height;
      dim3 blocks(iDivUp(width+2*LAPLACE_R, LAPLACE_W), height);
      dim3 threads(LAPLACE_W+2*LAPLACE_R, LAPLACE_S);
      LaplaceMulti<<<blocks, threads>>>(texObj, results[0].d_data, width, pitch, height);
      checkMsg("LaplaceMulti() execution failed\n");
      return 0.0;
    }
I've already read this question, which seems somewhat similar, but I don't understand what the solution means or how to apply it to my problem.
Why does the error occur?
The error occurs because the code uses a feature that your GPU does not support: texture objects (cudaTextureObject_t). Texture objects require compute capability 3.0 or higher, and the Tesla M2090 is a compute capability 2.0 device. I am a little surprised that the compiler doesn't generate an error during compilation, but that is another question.
There is no solution except to use supported hardware, or to rewrite the code.
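To confirm this on a given machine, one option is to query the device's compute capability at startup and stop before any kernel that takes a cudaTextureObject_t is launched. The following is only a minimal sketch: supportsTextureObjects is a hypothetical helper that is not part of CudaSift, and it assumes the GPU of interest is device 0.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not part of CudaSift): texture objects
// (cudaTextureObject_t) need compute capability 3.0 or newer,
// so only the major version has to be checked.
static bool supportsTextureObjects(int major)
{
    return major >= 3;
}

int main()
{
    int device = 0;  // assumption: the GPU of interest is device 0
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, device);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }

    std::printf("%s: compute capability %d.%d\n",
                prop.name, prop.major, prop.minor);

    if (!supportsTextureObjects(prop.major)) {
        std::fprintf(stderr,
                     "Texture objects are not supported on this device.\n");
        return 1;
    }

    std::printf("Texture objects are supported.\n");
    return 0;
}
```

On a compute capability 2.0 device such as the M2090, this check should take the "not supported" branch, which gives a clearer diagnosis than the "unknown error" reported after the kernel launch.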
[This answer assembled from comments and added as a community wiki entry to get this answer off the unanswered list for the CUDA tag]