Is there really a timeout for kernels on NVIDIA GPUs?

Posted 2019-06-24 05:16

While searching for reasons why my kernels produce strange error messages or results that are all "0", I found an answer on SO mentioning a 5-second timeout for kernels running on NVIDIA GPUs. I googled for the timeout but could not find confirming sources or more information.

What do you know about it?

Could the timeout cause strange behaviour for kernels with a long runtime?

Thanks!

2 answers
We Are One
#2 · 2019-06-24 06:07

If you're on Windows Vista or later, the WDDM driver stack will automatically reset the device after about two seconds unless you tweak your TDR timeouts. (Windows can't tell the difference between a GPU running a lengthy kernel and a GPU that's locked up.) Tesla-branded cards running in TCC mode aren't subject to the normal display adapter restrictions and can therefore run longer kernels.
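To see whether a given card is subject to this limit, you can ask the runtime. The following is only a sketch: `cudaDeviceGetAttribute` with `cudaDevAttrKernelExecTimeout` and `cudaDevAttrTccDriver` are real runtime API calls, but `has_run_time_limit` and `report_run_time_limits` are helper names I made up for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not part of CUDA): interprets the value of
// cudaDevAttrKernelExecTimeout, which is nonzero when kernels on the
// device are subject to a run time limit.
bool has_run_time_limit(int exec_timeout_attr) {
    return exec_timeout_attr != 0;
}

// Hypothetical helper: prints one line per visible device and returns
// the device count, or -1 if the count could not be queried.
int report_run_time_limits() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount: %s\n", cudaGetErrorString(err));
        return -1;
    }
    for (int dev = 0; dev < count; ++dev) {
        int timeout = 0;
        int tcc = 0;
        // Both attributes are queried per device; skip the device on failure.
        if (cudaDeviceGetAttribute(&timeout, cudaDevAttrKernelExecTimeout, dev) != cudaSuccess ||
            cudaDeviceGetAttribute(&tcc, cudaDevAttrTccDriver, dev) != cudaSuccess) {
            std::fprintf(stderr, "device %d: attribute query failed\n", dev);
            continue;
        }
        std::printf("device %d: run time limit %s, TCC driver %s\n", dev,
                    has_run_time_limit(timeout) ? "enabled" : "disabled",
                    tcc ? "yes" : "no");
    }
    return count;
}
```

Calling `report_run_time_limits()` from `main` and compiling with `nvcc` should list each device. A card that drives a display would normally be expected to report the limit as enabled.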

三岁会撩人
#3 · 2019-06-24 06:09

Further googling brought up this in the CUDA_Toolkit_Release_Notes_Linux.txt (Known Issues):

# Individual GPU program launches are limited to a run time of less than 5 seconds on a GPU with a display attached. Exceeding this time limit usually causes a launch failure reported through the CUDA driver or the CUDA runtime. GPUs without a display attached are not subject to the 5 second runtime restriction. For this reason it is recommended that CUDA be run on a GPU that is NOT attached to a display and does not have the Windows desktop extended onto it. In this case, the system must contain at least one NVIDIA GPU that serves as the primary graphics adapter.

[update] It seems that the official name for this feature is 'watchdog'.
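Since the failure is reported through the runtime, checking the error state after each launch should show whether the watchdog is behind the strange results. A rough sketch follows. It assumes the timeout surfaces as `cudaErrorLaunchTimeout`, although the exact code may depend on driver and platform. The `fill` kernel and the helpers `is_watchdog_timeout` and `launch_and_check` are made-up names for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Placeholder kernel, only here so there is something to launch.
__global__ void fill(float* out, int n, float value) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = value;
}

// Hypothetical helper: true if the error code is the one the runtime
// documents for a launch that exceeded the run time limit.
bool is_watchdog_timeout(cudaError_t err) {
    return err == cudaErrorLaunchTimeout;
}

// Hypothetical helper: launches the kernel, waits for it, and reports
// any error instead of silently continuing with unwritten output.
cudaError_t launch_and_check(float* d_out, int n) {
    fill<<<(n + 255) / 256, 256>>>(d_out, n, 1.0f);
    cudaError_t err = cudaGetLastError();   // errors from the launch itself
    if (err == cudaSuccess) {
        err = cudaDeviceSynchronize();      // errors raised while the kernel ran
    }
    if (err != cudaSuccess) {
        std::fprintf(stderr, "kernel failed: %s%s\n", cudaGetErrorString(err),
                     is_watchdog_timeout(err) ? " (run time limit exceeded)" : "");
    }
    return err;
}
```

If the return value is never checked, a killed kernel leaves the output buffer unwritten, which would be one plausible explanation for results that are all "0".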
