CUDA ptxas warnings (Stack size for entry)

Posted 2019-07-07 03:19

Question:

I am getting the following warning, which I don't understand, when compiling CUDA code:

CUDACOMPILE : ptxas warning : Stack size for entry function '_Z24gpu_kernel_get_3d_pointsiPK8RtmPointS1_PKfS3_P10RtmPoint3DPif' cannot be statically determined

The kernel prototype is:

__global__ void gpu_kernel_get_3d_points(int count1, const RtmPoint *pPoints1, const RtmPoint *pPoints2, const float *PL, const float *PR, 
RtmPoint3D *pPoints3D, int *pGlobalCount, float bbox)

All the pointers are pointers to device memory. I don't see why the compiler should have a problem determining the stack size. There are a few local variables in the kernel, but not many. Any ideas? Does this warning matter?

Answer 1:

It seems like your kernel is dynamically allocating memory on the GPU heap using malloc() or the new operator. It may have an adverse effect on your kernel's performance.



Answer 2:

This warning appears when a function called from the kernel is recursive. CUDA tries to reserve stack space before execution, which works as long as the compiler can compute how much is needed. With recursion it cannot: the recursion depth is not known at compile time, so the stack usage is not predictable. The warning is usually harmless, but if your kernel needs more stack than the GPU provides by default, you must increase the stack size manually.
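
As a rough illustration of the point above, here is a minimal sketch of a recursive device function together with a manual stack size increase through the runtime call cudaDeviceSetLimit(cudaLimitStackSize, ...). The names factorial, factorial_kernel and run_factorial are hypothetical and do not come from the question, and the 4096-byte value is an arbitrary assumption, not a recommendation. Whether ptxas actually prints the warning for this code may depend on the compiler version and optimization settings.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical recursive device function. The recursion depth depends on a
// runtime argument, so the compiler may be unable to bound the stack usage.
__device__ int factorial(int n) {
    return (n <= 1) ? 1 : n * factorial(n - 1);
}

// Hypothetical kernel that calls the recursive function from a single thread.
__global__ void factorial_kernel(int n, int *out) {
    *out = factorial(n);
}

// Host helper: raises the per-thread stack limit, runs the kernel,
// and returns the result (or -1 if any CUDA call fails).
int run_factorial(int n) {
    size_t stackBytes = 0;
    if (cudaDeviceGetLimit(&stackBytes, cudaLimitStackSize) != cudaSuccess) return -1;
    printf("default stack size per thread: %zu bytes\n", stackBytes);

    // Assumed value for illustration only; pick a size that fits your recursion depth.
    if (cudaDeviceSetLimit(cudaLimitStackSize, 4096) != cudaSuccess) return -1;

    int *d_out = nullptr;
    int h_out = -1;
    if (cudaMalloc(&d_out, sizeof(int)) != cudaSuccess) return -1;

    factorial_kernel<<<1, 1>>>(n, d_out);

    // cudaMemcpy waits for the kernel to finish and reports any launch or execution error.
    cudaError_t err = cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return (err == cudaSuccess) ? h_out : -1;
}

int main() {
    printf("factorial(5) = %d\n", run_factorial(5));
    return 0;
}
```

The limit has to be set from the host before the kernel launch, and it applies per thread, so a large value multiplied by many resident threads can consume a noticeable amount of device memory.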