For the NVIDIA GeForce 940MX GPU, deviceQuery shows it has 3 multiprocessors and 128 cores per multiprocessor.
Number of threads per multiprocessor = 2048
So 3 * 2048 = 6144, i.e. a total of 6144 threads in the GPU.
6144 / 1024 = 6, i.e. a total of 6 blocks. The warp size is 32.
But from this video, https://www.youtube.com/watch?v=kzXjRFL-gjo, I found that each GPU has a limit on threads but no limit on the number of blocks.
So I got confused by this. I would like to know:
- How many total threads are in my GPU? Can we use all of the threads to execute a program?
- How many blocks and grids are there?
It appears the main source of your confusion is mixing up two completely different sets of limits.
The numbers you quote (2048 threads per multiprocessor, three multiprocessors in total = 6144 threads) represent the first set of limits: how many threads the hardware can keep resident at one time. The numbers you show in your screenshot of the deviceQuery output define the second set: the limits of a given kernel launch. While they overlap somewhat, you can treat them as more or less separate. For a more thorough discussion of the practicalities of kernel launch parameters and block dimensions, see here.
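To make the distinction concrete, here is a minimal host-only C++ sketch of the arithmetic behind the two sets of limits. It uses the numbers quoted in the question (3 multiprocessors, 2048 threads per multiprocessor, 1024 threads per block). The function names are hypothetical and chosen for illustration only. On a real device you would read these values from the CUDA runtime's cudaGetDeviceProperties (fields such as multiProcessorCount, maxThreadsPerMultiProcessor, maxThreadsPerBlock and maxGridSize) rather than hard-coding them.

```cpp
#include <cassert>
#include <cstdint>

// First set of limits (hardware residency): the most threads the GPU can
// keep resident at one time. Both arguments are deviceQuery values.
constexpr std::int64_t max_resident_threads(std::int64_t sm_count,
                                            std::int64_t max_threads_per_sm) {
    return sm_count * max_threads_per_sm;
}

// Upper bound on simultaneously resident blocks, counting threads only.
// Assumption: registers, shared memory and the per-multiprocessor block
// limit are ignored here; in practice they can lower this number.
constexpr std::int64_t resident_blocks_upper_bound(std::int64_t sm_count,
                                                   std::int64_t max_threads_per_sm,
                                                   std::int64_t threads_per_block) {
    return max_resident_threads(sm_count, max_threads_per_sm) / threads_per_block;
}

// Second set of limits (kernel launch): the number of blocks needed to give
// each of n elements its own thread. This is a property of the launch, not
// of how many blocks the hardware can hold at once.
inline std::int64_t blocks_for(std::int64_t n, std::int64_t threads_per_block) {
    assert(threads_per_block > 0);
    return (n + threads_per_block - 1) / threads_per_block;  // ceiling division
}
```

With the question's numbers, max_resident_threads(3, 2048) is 6144 and resident_blocks_upper_bound(3, 2048, 1024) is 6, which matches the calculation in the question. A launch over one million elements, blocks_for(1000000, 1024), needs 977 blocks. That is still a valid launch: the blocks that do not fit are scheduled as earlier ones finish.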