Nvidia's website explains the time-out problem:
Q: What is the maximum kernel execution time? On Windows, individual GPU program launches have a maximum run time of around 5 seconds. Exceeding this time limit usually will cause a launch failure reported through the CUDA driver or the CUDA runtime, but in some cases can hang the entire machine, requiring a hard reset. This is caused by the Windows "watchdog" timer that causes programs using the primary graphics adapter to time out if they run longer than the maximum allowed time.
For this reason it is recommended that CUDA is run on a GPU that is NOT attached to a display and does not have the Windows desktop extended onto it. In this case, the system must contain at least one NVIDIA GPU that serves as the primary graphics adapter.
Source: https://developer.nvidia.com/cuda-faq
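As a sanity check (this part is my addition, not from the FAQ): the CUDA runtime reports whether the watchdog applies to a given device via the kernelExecTimeoutEnabled field of cudaDeviceProp, so you can confirm which of your cards Windows is actually policing. A minimal sketch:

```cpp
// check_timeout.cu -- list CUDA devices and report whether the Windows
// watchdog (TDR) kernel time-out applies to each of them.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        // kernelExecTimeoutEnabled == 1 means the watchdog can kill kernels
        // on this device; 0 means it is not subject to the run-time limit.
        printf("Device %d: %s, kernelExecTimeoutEnabled = %d\n",
               i, prop.name, prop.kernelExecTimeoutEnabled);
    }
    return 0;
}
```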
So it seems that Nvidia believes, or at least strongly implies, that having multiple (Nvidia) GPUs with the proper configuration can prevent this from happening?
But how? So far I have tried lots of ways, but the annoying time-out still occurs on a GK110 GPU that is: (1) plugged into the secondary PCIe 16x slot; (2) not connected to any monitor; (3) set as a dedicated PhysX card in the driver control panel (as recommended by some other people). The time-out is still there.
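One thing worth ruling out (an assumption on my part, since the host code isn't shown here): CUDA defaults to device 0, which is usually the display GPU, so unless the program explicitly selects the GK110, kernels may still be launching on the watchdog-limited card. A hedged sketch, assuming the GK110 enumerates as device 1 (verify the index with the listing code above first):

```cpp
// select_device.cu -- sketch assuming the GK110 is device 1; select it
// explicitly so the test kernel does not run on the display GPU (device 0).
#include <cstdio>
#include <cuda_runtime.h>

__global__ void busyKernel(long long cycles) {
    // Spin for roughly `cycles` clock ticks to emulate a long-running kernel.
    long long start = clock64();
    while (clock64() - start < cycles) { }
}

int main() {
    // Hypothetical device index -- adjust to whatever the GK110 reports.
    cudaError_t err = cudaSetDevice(1);
    if (err != cudaSuccess) {
        printf("cudaSetDevice failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    busyKernel<<<1, 1>>>(1000000000LL);
    err = cudaDeviceSynchronize();
    // If the watchdog fired, this typically reports a launch timeout error.
    printf("kernel result: %s\n", cudaGetErrorString(err));
    return 0;
}
```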