CUDA kernels using pthreads: missing configuration error

Posted 2019-09-03 13:52

Question:

What is the meaning of the missing configuration error in CUDA? The code below is a thread function. When I run it, the error code obtained is 1, which implies a missing configuration error. What is the mistake in this code?

  void* run(void *args)
  {
      cudaError_t error;
      Matrix *matrix = (Matrix*)args;
      int scalar = 2;
      dim3 dimGrid(1,1,1);
      dim3 dimBlock(1024,1,1);
      cudaEvent_t start, stop;
      cudaSetDevice(0);
      cudaEventCreate(&start);
      cudaEventCreate(&stop);
      cudaEventRecord(start,0);
      for (int i = 0; i < matrix->number; i++)
      {
          syntheticKernel<<<dimGrid,dimBlock>>>();
          cudaThreadSynchronize();
      }
      cudaEventRecord(stop,0);
      cudaEventSynchronize(stop);
      cudaEventElapsedTime(&matrix->time,start,stop);
      error = cudaGetLastError();
      assert(error!=0);
      printf("%d\n",error);
  }

Answer 1:

Can you add more detail about your program, please? Each CUDA API routine returns a status code, so you should check the status of every API call to catch and decode the first reported error.
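As a minimal sketch of what checking every call could look like: the `CUDA_CHECK` macro below is a hypothetical helper (it is not part of the CUDA API), the empty `syntheticKernel` body is a placeholder for the kernel in the question, and the block size of 256 is an arbitrary choice. A kernel launch returns no status itself, so the sketch queries `cudaGetLastError()` immediately after the launch and then checks the result of the synchronization. It uses `cudaDeviceSynchronize()`, which replaced the deprecated `cudaThreadSynchronize()` used in the question.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro (not part of the CUDA API): stop with a readable
// message as soon as a runtime call returns anything other than cudaSuccess.
#define CUDA_CHECK(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            fprintf(stderr, "%s:%d: %s failed: %s\n", __FILE__, __LINE__, \
                    #call, cudaGetErrorString(err_));                     \
            exit(EXIT_FAILURE);                                           \
        }                                                                 \
    } while (0)

// Placeholder kernel standing in for the question's syntheticKernel.
__global__ void syntheticKernel() {}

int main()
{
    CUDA_CHECK(cudaSetDevice(0));

    syntheticKernel<<<dim3(1, 1, 1), dim3(256, 1, 1)>>>();
    // The launch itself returns nothing, so ask for its status explicitly.
    CUDA_CHECK(cudaGetLastError());        // launch and configuration errors
    CUDA_CHECK(cudaDeviceSynchronize());   // errors raised while the kernel ran

    printf("kernel completed without error\n");
    return 0;
}
```

With this pattern the first failing call is reported with its file, line, and a decoded message, instead of a bare number printed at the end of the thread function.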

One point to check is that you have not called any CUDA API routines before you create the pthreads. Creating a CUDA context (which happens automatically in most, but not all, CUDA API routines) before creating the threads will cause problems. Check this; if it is not the problem, add more details to your question and check the return value of all API calls.
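The ordering described above could be sketched as follows, assuming the concern is the older behaviour (CUDA releases before 4.0) in which a context was bound to the host thread that created it. The names `worker` and `kNumThreads`, the empty kernel body, and the block size of 256 are illustrative assumptions, not taken from the question. The main thread makes no CUDA runtime calls before `pthread_create`, and each worker performs its own `cudaSetDevice` and checks every status it gets back.

```cuda
#include <cstdio>
#include <pthread.h>
#include <cuda_runtime.h>

// Placeholder kernel standing in for the question's syntheticKernel.
__global__ void syntheticKernel() {}

static const int kNumThreads = 2;   // hypothetical worker count

// Each worker makes its first CUDA call itself, so any context setup
// happens inside the thread that goes on to use it.
static void *worker(void *arg)
{
    int *status = static_cast<int *>(arg);
    *status = 1;  // assume failure until every call has succeeded

    cudaError_t err = cudaSetDevice(0);
    if (err == cudaSuccess) {
        syntheticKernel<<<1, 256>>>();
        err = cudaGetLastError();          // launch errors
    }
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();     // execution errors

    if (err != cudaSuccess)
        fprintf(stderr, "worker: %s\n", cudaGetErrorString(err));
    else
        *status = 0;
    return NULL;
}

int main()
{
    pthread_t threads[kNumThreads];
    int status[kNumThreads];
    int created = 0;

    // Deliberately no CUDA runtime calls here before the threads exist.
    for (int i = 0; i < kNumThreads; ++i) {
        if (pthread_create(&threads[i], NULL, worker, &status[i]) != 0) {
            fprintf(stderr, "pthread_create failed for worker %d\n", i);
            break;
        }
        ++created;
    }

    // Count workers that failed, plus any that could not be started.
    int failures = kNumThreads - created;
    for (int i = 0; i < created; ++i) {
        pthread_join(threads[i], NULL);
        failures += status[i];
    }
    printf("%d of %d workers failed\n", failures, kNumThreads);
    return failures;
}
```

One way to build this is `nvcc example.cu -o example -lpthread`, where the file name is arbitrary.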



Answer 2:

Why are you launching a single block in the grid? This configuration seems suspicious:

  dim3 dimGrid(1,1,1);
  dim3 dimBlock(1024,1,1);

Try increasing the grid size and putting fewer threads in each block. But your main problem is probably about contexts, as Tom suggests.
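A sketch of a less suspicious launch configuration, under stated assumptions: the kernel `scaleKernel`, the helper `blocksFor`, the problem size, and the block size of 256 are all hypothetical and only illustrate the idea. The block size is capped by the device's `maxThreadsPerBlock` as reported by `cudaGetDeviceProperties`, which matters here because 1024 threads per block is more than devices of compute capability 1.x allow (their limit is 512). The grid is then sized to cover all elements.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Number of blocks needed to cover n elements with blockSize threads each.
static unsigned int blocksFor(unsigned int n, unsigned int blockSize)
{
    return (n + blockSize - 1) / blockSize;
}

// Hypothetical kernel: multiplies every element by a scalar.
__global__ void scaleKernel(float *data, unsigned int n, float scalar)
{
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)                 // the last block may be only partly used
        data[i] *= scalar;
}

int main()
{
    const unsigned int n = 1u << 20;   // illustrative problem size

    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Start from a moderate block size and never exceed the device limit.
    unsigned int blockSize = 256;
    if (blockSize > static_cast<unsigned int>(prop.maxThreadsPerBlock))
        blockSize = static_cast<unsigned int>(prop.maxThreadsPerBlock);
    const unsigned int gridSize = blocksFor(n, blockSize);

    float *data = NULL;
    err = cudaMalloc(reinterpret_cast<void **>(&data), n * sizeof(float));
    if (err == cudaSuccess)
        err = cudaMemset(data, 0, n * sizeof(float));
    if (err == cudaSuccess) {
        scaleKernel<<<dim3(gridSize, 1, 1), dim3(blockSize, 1, 1)>>>(data, n, 2.0f);
        err = cudaGetLastError();          // invalid configurations show up here
    }
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();

    cudaFree(data);                        // freeing a null pointer is a no-op
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("launched %u blocks of %u threads\n", gridSize, blockSize);
    return 0;
}
```

Spreading the work over many smaller blocks lets the hardware schedule blocks across all multiprocessors, whereas a single block can only ever occupy one of them.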