I'm getting an out-of-resources error when trying to launch a CUDA kernel (through PyCUDA), and I'm wondering if it's possible to get the system to tell me which resource I'm short on. Obviously the system knows which resource has been exhausted; I just want to be able to query that information.
I've used the occupancy calculator and everything seems okay, so either there's a corner case it doesn't cover or I'm using it wrong. I know it's not registers (which seem to be the usual culprit), because my kernel uses <= 63 registers per thread and it still fails with a 1x1x1 block and a 1x1 grid on a CC 2.1 device.
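For reference, a minimal sketch of the kind of cross-check I mean, using the resource attributes PyCUDA exposes on a compiled Function and the device limits (the trivial kernel here is just a stand-in for the real one that fails to launch):

```
import pycuda.autoinit
import pycuda.driver as drv
from pycuda.compiler import SourceModule

# Stand-in kernel; the real one is whatever fails to launch.
mod = SourceModule("""
__global__ void my_kernel(float *out)
{
    out[threadIdx.x] = 0.0f;
}
""")
func = mod.get_function("my_kernel")

# Per-thread / per-launch resource usage of the compiled kernel.
print("registers per thread:   ", func.num_regs)
print("static shared memory:   ", func.shared_size_bytes)
print("local memory per thread:", func.local_size_bytes)

# Hard limits of the device the kernel will run on.
dev = pycuda.autoinit.device
attr = drv.device_attribute
print("max threads per block:   ", dev.get_attribute(attr.MAX_THREADS_PER_BLOCK))
print("max registers per block: ", dev.get_attribute(attr.MAX_REGISTERS_PER_BLOCK))
print("max shared mem per block:", dev.get_attribute(attr.MAX_SHARED_MEMORY_PER_BLOCK))
```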
Thanks for any help. I posted a thread on the NVidia boards:
http://forums.nvidia.com/index.php?showtopic=206261&st=0
but got no responses there. If the answer is "you can't ask the system for that information", that would be nice to know too (sort of... ;).
Edit:
The highest register usage I've seen is 63; I've edited the above to reflect that.
See this answer: CUDA maximum registers per thread: sm_12 vs sm_20. It seems 70 registers per thread is too many.
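If register pressure does turn out to be the problem, a rough sketch of how the count can be checked and capped from PyCUDA, relying only on SourceModule passing options through to nvcc (the kernel source is a placeholder):

```
import pycuda.autoinit
from pycuda.compiler import SourceModule

kernel_src = """
__global__ void my_kernel(float *out) { out[threadIdx.x] = 0.0f; }
"""

# --ptxas-options=-v makes ptxas report per-kernel register/shared/local usage
# (PyCUDA surfaces the compiler's output as a warning), and --maxrregcount caps
# register use per thread, spilling anything beyond the cap to local memory.
mod = SourceModule(kernel_src,
                   options=["--ptxas-options=-v", "--maxrregcount=63"])
```

Note that capping registers trades register pressure for local-memory spills, so it can slow the kernel down even when it makes the launch succeed.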
I think PyCUDA uses the CUDA driver API, so the following may be what is wrong: CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES can happen if you do not specify enough arguments, or you specify the wrong size for the arguments, when using cuLaunch() to launch kernels. Since you are using PyCUDA, it is pretty easy to mismatch the argument list a kernel requires and the arguments you are actually passing, so you might want to check how you are calling your kernels. I think this is a poorly named error code for this situation...
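To illustrate (with a made-up kernel and names, and assuming a reasonably recent PyCUDA where prepared_call takes the block size), a sketch of the usual ways to avoid argument-size mismatches: give every scalar an explicit numpy type, or declare the argument layout once with prepare():

```
import numpy as np
import pycuda.autoinit
import pycuda.driver as drv
from pycuda.compiler import SourceModule

# Hypothetical kernel: a float pointer plus a 32-bit int count.
mod = SourceModule("""
__global__ void scale(float *data, int n)
{
    int i = threadIdx.x + blockIdx.x * blockDim.x;
    if (i < n) data[i] *= 2.0f;
}
""")
scale = mod.get_function("scale")

data = np.ones(256, dtype=np.float32)
data_gpu = drv.mem_alloc(data.nbytes)
drv.memcpy_htod(data_gpu, data)

# Give every scalar an explicit numpy type matching the C signature, so the
# packed argument buffer is exactly the size the kernel expects.
scale(data_gpu, np.int32(256), block=(256, 1, 1), grid=(1, 1))

# Or declare the argument layout once ("P" = pointer, "i" = 32-bit int) and
# let PyCUDA pack each call from that format; int() yields the raw pointer.
scale.prepare("Pi")
scale.prepared_call((1, 1), (256, 1, 1), int(data_gpu), 256)
```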