How to calculate individual thread coordinate indi

2019-08-03 20:04发布

I have a 3 D grid consisting of 3D blocks. I wish to calculate the individual thread indexes of each coordinates every time the kernel is being called. I have these parameters:

dim3 blocks_query(32,32,32);
dim3 threads_query(32,32,32);
kernel<<< blocks_query,threads_query >>>();

Inside the kernel, I wish to calculate the individual values of x,y and z coordinates for instance, x=0,y=0,z=0, x=0,y=0,z=1, x=0,y=0,z=2,....thanks in advance....

标签: cuda
1条回答
ゆ 、 Hurt°
2楼-- · 2019-08-03 21:03

Individual thread indices (x, y, z coordinates) can be calculated inside the kernel as follows:

int x = blockIdx.x * blockDim.x + threadIdx.x;
int y = blockIdx.y * blockDim.y + threadIdx.y;
int z = blockIdx.z * blockDim.z + threadIdx.z;

Keep in mind that the number of threads per block is limited by the GPU. So the block size you have created is invalid.

dim3 threads_query(32,32,32)

It equals to 32768 threads per block which is not supported by any of the current CUDA devices. Currently, maximum 1024 threads per block is supported for GPUs of Compute capability 2.0 and above while maximum 512 threads for older GPUs. You should reduce the block size otherwise the kernel would not launch. Another thing to be noted is that you are creating 3D grid which is supported only on CUDA GPUs of Compute 2.0 and above.

UPDATE

Suppose the dimensions of your 3D data are xDim, yDim and zDim, then a generic grid of thread blocks can be formed as follows:

dim3 threads_query(8,8,8);

dim3 blocks_query;

blocks_query.x = (xDim + threads_query.x - 1)/threads_query.x;
blocks_query.y = (yDim + threads_query.y - 1)/threads_query.y;
blocks_query.z = (zDim + threads_query.z - 1)/threads_query.z;

The above approach will create total number of threads equal to or greater than the total data size. The extra threads may cause invalid memory access. So perform bound checks inside the kernel. You can do this by passing xDim, yDim and zDim as kernel arguments and adding the following line inside the kernel:

if(x>=xDim || y>=yDim || z>=zDim) return;
查看更多
登录 后发表回答