Is there any way to find out the number of free/active SMs? Or at least to read the voltage/power or temperature of each SM, so that I can tell whether it is working (in real time, while a job is executing on the GPU device)?
%smid helped me find the ID of each SM. Something similar would be helpful.
Thanks and Regards,
Rakesh
The CUDA Profiling Tools Interface (CUPTI) contains an Events API that enables runtime sampling of the GPU's performance monitor (PM) counters. The CUPTI SDK ships as part of the CUDA Toolkit. Documentation on sampling can be found in the section "CUPTI Events API \ Sampling Events".
One or more of the following counters will give you a good idea of SM activity:
- active_cycles: Number of cycles in which a multiprocessor has at least one active warp.
- active_warps: Accumulated number of active warps per cycle. Every cycle, it increments by the number of active warps in that cycle, which can range from 0 to 48 or from 0 to 64, depending on the maximum number of resident warps per multiprocessor on the device.
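As a rough illustration of how the two counters can be combined once they have been sampled, here is a minimal host-side sketch in C++. It makes no CUPTI calls. The struct and function names are hypothetical, and `window_cycles` (the total number of SM cycles in the sampling window) is an assumed input that you would have to obtain separately. Only the counter names `active_cycles` and `active_warps` come from the answer above.

```cpp
#include <cstdint>

// Hypothetical container for one SM's counter deltas over a sampling window.
struct SmSample {
    std::uint64_t active_cycles;  // cycles with at least one active warp
    std::uint64_t active_warps;   // per-cycle active warp counts, accumulated
    std::uint64_t window_cycles;  // assumed: total SM cycles in the window
};

// Fraction of the window during which the SM had at least one active warp.
double busy_fraction(const SmSample& s) {
    if (s.window_cycles == 0) return 0.0;
    return static_cast<double>(s.active_cycles) /
           static_cast<double>(s.window_cycles);
}

// Average number of active warps per active cycle.
double avg_active_warps(const SmSample& s) {
    if (s.active_cycles == 0) return 0.0;  // SM was idle for the whole window
    return static_cast<double>(s.active_warps) /
           static_cast<double>(s.active_cycles);
}

// Average active warps relative to the per-SM maximum (48 or 64,
// depending on the device).
double achieved_occupancy(const SmSample& s, unsigned max_warps_per_sm) {
    if (max_warps_per_sm == 0) return 0.0;
    return avg_active_warps(s) / static_cast<double>(max_warps_per_sm);
}
```

An SM whose `active_cycles` delta stays at zero across a window was idle for that window, which is probably the closest answer to the "free/active" question that these counters can give.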