Is there any way to find out the number of free/active SMs? Or at least to read the voltage/power or temperature of each SM, so that I can tell whether it is working or not (in real time, while a job is executing on the GPU device)?
%smid helped me find the ID of each SM. Something similar would be helpful.
Thanks and Regards, Rakesh
The CUDA Profiling Tools Interface (CUPTI) contains an Events API that enables run-time sampling of the GPU's performance monitor (PM) counters. The CUPTI SDK ships as part of the CUDA Toolkit. Documentation on sampling can be found in the section CUPTI Events API \ Sampling Events.
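As a rough illustration of the sampling flow described above, here is a minimal host-side sketch that polls one counter in continuous collection mode. Several things in it are assumptions rather than anything stated in this thread: the event name `active_cycles` (event names differ between GPU generations, so check what your device exposes), device 0, the 100 ms polling interval, and the idea that your kernels are launched from another host thread using the same context (not shown). Continuous sampling and the Events API itself are only available on some GPUs and toolkit versions, so treat this as a starting point, not a definitive implementation.

```cuda
// Sketch: sample a CUPTI event periodically while a workload runs elsewhere.
// Build (paths may differ): nvcc sample.cu -lcuda -lcupti
#include <chrono>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <thread>

#include <cuda.h>
#include <cupti.h>

#define CHECK_CU(call)                                              \
  do {                                                              \
    CUresult r_ = (call);                                           \
    if (r_ != CUDA_SUCCESS) {                                       \
      fprintf(stderr, "%s failed (%d)\n", #call, (int)r_);          \
      exit(EXIT_FAILURE);                                           \
    }                                                               \
  } while (0)

#define CHECK_CUPTI(call)                                           \
  do {                                                              \
    CUptiResult r_ = (call);                                        \
    if (r_ != CUPTI_SUCCESS) {                                      \
      fprintf(stderr, "%s failed (%d)\n", #call, (int)r_);          \
      exit(EXIT_FAILURE);                                           \
    }                                                               \
  } while (0)

int main() {
  CUdevice dev;
  CUcontext ctx;
  CHECK_CU(cuInit(0));
  CHECK_CU(cuDeviceGet(&dev, 0));               // assumption: device 0
  CHECK_CU(cuDevicePrimaryCtxRetain(&ctx, dev));
  CHECK_CU(cuCtxSetCurrent(ctx));

  // Continuous mode lets counters be read while kernels are still running.
  CHECK_CUPTI(cuptiSetEventCollectionMode(
      ctx, CUPTI_EVENT_COLLECTION_MODE_CONTINUOUS));

  // Assumption: "active_cycles" exists on this device; names vary by GPU.
  const char *eventName = "active_cycles";
  CUpti_EventID eventId;
  CUpti_EventGroup group;
  CHECK_CUPTI(cuptiEventGroupCreate(ctx, &group, 0));
  CHECK_CUPTI(cuptiEventGetIdFromName(dev, eventName, &eventId));
  CHECK_CUPTI(cuptiEventGroupAddEvent(group, eventId));
  CHECK_CUPTI(cuptiEventGroupEnable(group));

  // Poll the counter ten times; the workload is assumed to run concurrently.
  for (int i = 0; i < 10; ++i) {
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
    uint64_t value = 0;
    size_t bytes = sizeof(value);
    CHECK_CUPTI(cuptiEventGroupReadEvent(
        group, CUPTI_EVENT_READ_FLAG_NONE, eventId, &bytes, &value));
    printf("%s sample %d: %llu\n", eventName, i, (unsigned long long)value);
  }

  CHECK_CUPTI(cuptiEventGroupDisable(group));
  CHECK_CUPTI(cuptiEventGroupDestroy(group));
  CHECK_CU(cuDevicePrimaryCtxRelease(dev));
  return 0;
}
```

A counter value that stays flat between samples suggests the sampled unit was idle during that interval; to see every SM rather than a single domain instance, you would additionally need to configure the event group to profile all domain instances, which this sketch leaves out.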
One or more of the following counters will give you a good idea of SM activity: