CUDA apps time out & fail after several seconds

I've noticed that CUDA applications tend to have a rough maximum run time of 5-15 seconds before they fail and exit. I realize that ideally a CUDA application should not run that long, but assuming CUDA is the right choice and the amount of sequential work per thread means it must run that long, is there any way to extend this time limit or to get around it?

Tags: cuda timeout gpgpu gpu-programming

8 Answers

可以哭但决不认输i

Reply #2 · 2019-01-08 08:51

Resolving Timeout Detection and Recovery (TDR) on Windows 7 (32/64-bit)

Create a registry value in Windows that raises the TDR delay, so that Windows waits longer before the TDR process starts.

Open Regedit from the Run dialog or a command prompt.

In Windows 7, navigate to the registry key where the new value goes:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers

There will probably be one value in there already, a DWORD called DxgKrnlVersion.

Right-click, create a new REG_DWORD value, and name it TdrDelay. Its data is the number of seconds before TDR kicks in. The default is 2, even though the value doesn't exist in the registry until you create it. Assign it a new value (I tried 4 seconds, which doubles the time before TDR), then restart the PC. The new value does not take effect until you restart.

Source: Win7 TDR (Driver Timeout Detection & Recovery). I have also verified this myself and it works fine.
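As an alternative to clicking through Regedit, the same value can be written into a .reg file and imported. Below is a minimal Python sketch that builds the file text. The key path and the TdrDelay name come from the steps above; the function name `tdr_delay_reg_text` and the decision to reject values below 1 are illustrative assumptions. The one detail worth noticing is that .reg files store DWORD data as eight hexadecimal digits, so 10 seconds becomes `0000000a`.

```python
# Sketch: build the text of a .reg file that sets TdrDelay.
# The key path and value name are taken from the answer above;
# the function name and the range check are illustrative assumptions.

KEY_PATH = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers"


def tdr_delay_reg_text(seconds: int) -> str:
    """Return .reg file contents that set TdrDelay to `seconds`."""
    if not 1 <= seconds <= 0xFFFFFFFF:
        raise ValueError("seconds must be a positive 32-bit value")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{KEY_PATH}]",
        # .reg files write DWORD data as eight hexadecimal digits.
        f'"TdrDelay"=dword:{seconds:08x}',
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(tdr_delay_reg_text(10))
```

Importing the resulting file still needs administrator rights, and, as noted above, a restart before the new delay applies.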

时光不老，我们不散

Reply #3 · 2019-01-08 08:53

On Windows, the graphics driver has a watchdog timer that kills any shader programs that run for more than 5 seconds. Note that the Xorg/XFree86 drivers don't do this, so one possible workaround is to run the CUDA apps on Linux.

AFAIK it is not possible to disable the watchdog timer on Windows. The only way to get around this on Windows is to use a second card that drives no display. It doesn't have to be a Tesla, but it must have no active screens.

做自己的国王

Reply #4 · 2019-01-08 08:54

The most basic solution is to pick a point some percentage of the way through the calculation that I am sure the GPU I am working with can reach in time, save all the state information and stop there, then start again from that point.
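The save-and-restart idea can be sketched on the host side roughly as follows. This is plain Python with no GPU calls, only meant to show the control flow: each call does a bounded slice of work, writes its state to disk, and the next call resumes from there. The names `run_in_slices` and `step_fn`, and the JSON state file, are assumptions made for illustration; in a real CUDA program each slice would be one kernel launch short enough to finish before the watchdog fires.

```python
import json
import os
import tempfile


def run_in_slices(total_steps, steps_per_slice, state_path, step_fn):
    """Advance a computation by one slice per call, saving state in between.

    Returns True when all steps are done, False if more slices remain.
    """
    # Resume from the saved state if a previous slice left one behind.
    if os.path.exists(state_path):
        with open(state_path) as f:
            state = json.load(f)
    else:
        state = {"next_step": 0, "value": 0}

    stop = min(state["next_step"] + steps_per_slice, total_steps)
    for step in range(state["next_step"], stop):
        # Placeholder for the real work; on a GPU this loop would be
        # replaced by one bounded kernel launch.
        state["value"] = step_fn(state["value"], step)
    state["next_step"] = stop

    # Save everything needed to pick up again from this point.
    with open(state_path, "w") as f:
        json.dump(state, f)
    return stop >= total_steps


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "state.json")
    slices = 1
    while not run_in_slices(10, 4, path, lambda acc, i: acc + i):
        slices += 1
    with open(path) as f:
        print(slices, json.load(f)["value"])
```

Choosing `steps_per_slice` is the hard part in practice: it has to be small enough that the slowest GPU you care about finishes a slice within the limit.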

Update: For Linux: Exiting X will allow you to run CUDA applications for as long as you want. No Tesla required (a 9600 was used to test this).

One thing to note, however, is that if X is never started, the drivers probably won't be loaded, and it won't work.

It also seems that on Linux, simply not having any X display up at the time will also work, so X does not need to be exited as long as you switch to a non-X full-screen terminal.

女痞

Reply #5 · 2019-01-08 08:59

I'm not a CUDA expert; I've been developing with the AMD Stream SDK, which AFAIK is roughly comparable.

You can disable the Windows watchdog timer, but that is strongly discouraged, for reasons that should be obvious. To disable it, use regedit to go to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Watchdog\Display\DisableBugCheck, create a REG_DWORD and set it to 1. You may also need to do something in the NVIDIA control panel. Look for some reference to "VPU Recovery" in the CUDA docs.

Ideally, you should be able to break your kernel operations up into multiple passes over your data, so that each pass runs within the time limit.

Alternatively, you can divide the problem domain up so that it's computing fewer output pixels per command. That is, instead of computing 1,000,000 output pixels in one fell swoop, issue 10 commands to the GPU to compute 100,000 each.
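The 10 x 100,000 split amounts to computing an (offset, count) pair for each command. A small host-side sketch in Python, with no GPU calls: `plan_chunks` and `run_chunked` are hypothetical names, and `launch` stands in for whatever issues one command (for CUDA, presumably one kernel launch over that sub-range followed by a wait for it to complete).

```python
def plan_chunks(total_items, max_items_per_launch):
    """Return (offset, count) pairs that together cover range(total_items)."""
    if total_items < 0 or max_items_per_launch <= 0:
        raise ValueError("total_items must be >= 0 and chunk size > 0")
    return [
        # The last chunk may be shorter when the sizes don't divide evenly.
        (offset, min(max_items_per_launch, total_items - offset))
        for offset in range(0, total_items, max_items_per_launch)
    ]


def run_chunked(total_items, max_items_per_launch, launch):
    """Call `launch(offset, count)` once per chunk, in order."""
    for offset, count in plan_chunks(total_items, max_items_per_launch):
        # `launch` is a hypothetical stand-in for issuing one GPU command
        # that computes outputs [offset, offset + count).
        launch(offset, count)


if __name__ == "__main__":
    chunks = plan_chunks(1_000_000, 100_000)
    print(len(chunks), chunks[0], chunks[-1])
```

Each command then has to be told its offset so it writes to the right part of the output; the data itself can stay where it is between commands.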

The basic unit that has to fit within the time slice is not your entire application, but the execution of a single command buffer. In the AMD Stream SDK, a long sequence of operations can be broken up into multiple time slices by explicitly flushing the command queue with a CtxFlush() call. Perhaps CUDA has something similar?

You should not have to read all of your data back and forth across the PCIe bus on every time slice; you can leave your textures, etc. in GPU local memory. You just need some command buffers to complete occasionally, to prove to the OS that you're not stuck in an infinite loop.

Finally, GPUs are fast, so if your application is not able to do useful work in that 5 or 10 seconds, I'd take that as a sign that something is wrong.

[EDIT Mar 2010 to update:] (outdated again, see the updates below for the most recent information) The registry key above is out-of-date. I think that was the key for Windows XP 64-bit. There are new registry keys for Vista and Windows 7. You can find them here: http://www.microsoft.com/whdc/device/display/wddm_timeout.mspx or here: http://msdn.microsoft.com/en-us/library/ee817001.aspx

[EDIT Apr 2015 to update:] This is getting really out of date. The easiest way to disable TDR for CUDA programming, assuming you have the NVIDIA Nsight tools installed, is to open the Nsight Monitor, click on "Nsight Monitor options", and under "General" set "WDDM TDR enabled" to false. This will change the registry setting for you. Close and reboot. Any change to the TDR registry setting won't take effect until you reboot.

[EDIT August 2018 to update:] Although the NVIDIA tools allow disabling the TDR now, the same question is relevant for AMD/OpenCL developers. For those: The current link that documents the TDR settings is at https://docs.microsoft.com/en-us/windows-hardware/drivers/display/tdr-registry-keys

"骚年 ilove

Reply #6 · 2019-01-08 08:59

The watchdog timer only applies on GPUs with a display attached.

On Windows the timer is part of WDDM. It is possible to modify the settings (timeout, behaviour on reaching the timeout, etc.) with some registry keys; see this Microsoft article for more information.

再贱就再见

Reply #7 · 2019-01-08 09:02

It is possible to disable this behavior in Linux. Although the "watchdog" has an obvious purpose, it may cause some very unexpected results when doing extensive computations using shaders / CUDA.

The option can be toggled in your X configuration (likely /etc/X11/xorg.conf).

Adding Option "Interactive" "0" to the Device section for your GPU does the job.

For details on the config, see "CUDA Visual Profiler 'Interactive' X config option?".

For a description of the parameter, see ftp://download.nvidia.com/XFree86/Linux-x86/270.41.06/README/xconfigoptions.html#Interactive
