I use the cmake gui tool to configure my cuda project in vs2013. CMakeLists.txt is as below:
project(CUDA_PART)
# required cmake version
cmake_minimum_required(VERSION 3.0)
include_directories(${CUDA_PART_SOURCE_DIR}/common)
# packages
find_package(CUDA REQUIRED)
# nvcc flags
set(CUDA_NVCC_FLAGS -gencode arch=compute_20,code=sm_20;-G;-g)
set(CUDA_VERBOSE_BUILD ON)
#FILE(GLOB SOURCES "*.cu" "*.cpp" "*.c" "*.h")
CUDA_ADD_EXECUTABLE(CUDA_PART hist_gpu_shmem_atomics.cu)
The .cu file is from Cuda by example source code hist_gpu_shmem_atomics.cu
There are two problems:
After the line
histo_kernel <<<blocks * 2, 256 >>>(dev_buffer, SIZE, dev_histo);
an "invalid device function" error occurs.When I use the CUDA debugging tool to debug, its cannot trigger breakpoints in the device code.
But when I create a project with the same code by the cuda project temple in visual studio 2013.It works correctly!
So, is there something wrong in the CMakeLists.txt ?
OS: Win7 64bit;GPU: GTX960;CUDA: CUDA 7.5;VS: 2013 (and also 2010)
When I use set the "Code Generation" in vs2013 as follow :
The CUDA_NVCC_FLAGES turns out to be -gencode=arch=compute_20,code=\"sm_20,compute_20\"
It equals to:
-gencode=arch=compute_20,code=sm_20 \
-gencode=arch=compute_20,code=compute_20
So, I guess it will generate 2 versions machine code: the first one(SASS) with virtual and real architectures and the second one(PTX) with only virtual architecture. Since my GTX960 is a cc5.2 device, it chooses the second one (PTX) and convert it to a suitable SASS.