I have a multithreaded C++ application that runs on Windows, Mac and a few Linux flavors.
To make a long story short: In order for it to run at maximum efficiency, I have to be able to instantiate a single thread per physical processor/core. Creating more threads than there are physical processors/cores degrades the performance of my program considerably. I can already correctly detect the number of logical processors/cores on all three of these platforms. To be able to detect the number of physical processors/cores correctly, I'll have to detect whether hyper-threading is supported AND active.
My question therefore is: is there a way to detect whether Hyper Threading is supported and enabled? If so, how exactly?
Gathering ideas and concepts from some of the answers above, I have come up with this solution. Please critique.
For almost every OS, the standard "Get core count" feature returns the logical core count. But in order to get the physical core count, we must first detect if the CPU has hyper threading or not.
We now have the logical core count. To get the intended result, we must first check whether hyper threading is being used, or whether it's even available.
There is no Intel CPU with hyper threading that will hyper thread only one core (at least not from what I have read), which lets us find this in a really painless way. If hyper threading is in use, the logical processor count will be exactly double the physical core count. Otherwise, the operating system will detect one logical processor for every core, meaning the logical and the physical core counts will be identical.
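The doubling rule above can be sketched in C++ roughly as follows. This only illustrates the heuristic; it is not a detection method. The function name `estimate_physical_cores` is made up for this example, the `ht_active` flag is assumed to come from a separate CPUID or OS query, and the factor of two assumes two hardware threads on every core, which would not hold on a CPU that mixes cores with and without hyper threading.

```cpp
// Hypothetical helper applying the "logical = 2 x physical" rule from the text.
// ht_active is an input here; this function does not detect it.
unsigned estimate_physical_cores(unsigned logical, bool ht_active) {
    if (logical == 0) {
        return 0;                  // logical count unknown, nothing to estimate
    }
    if (ht_active && logical % 2 == 0) {
        return logical / 2;        // assumes two hardware threads per core
    }
    return logical;                // no HT (or odd count): one logical processor per core
}
```

A caller might pass `std::thread::hardware_concurrency()` as `logical`, keeping in mind that it may return 0 when the count is not known.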
This is very easy to do in Python:
Maybe you could look at the psutil source code to see what is going on?
I don't know that all three expose the information in the same way, but if you can safely assume that the NT kernel will report device information according to the POSIX standard (which NT supposedly has support for), then you could work off that standard.
However, differences in device management are often cited as one of the stumbling blocks to cross-platform development. At best I would implement this as three strands of logic; I wouldn't try to write one piece of code to handle all platforms evenly.
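As a rough sketch of what "three strands" could look like in C++, here is one function with a separate branch per platform. To keep it short it only fetches the logical processor count; a physical-core query would slot into the same three branches. The function name `logical_cpu_count` is made up for this example, and treating every non-Windows, non-Apple target as POSIX is an assumption.

```cpp
#if defined(_WIN32)
#include <windows.h>
#elif defined(__APPLE__)
#include <sys/types.h>
#include <sys/sysctl.h>
#else
#include <unistd.h>   // assumed: a POSIX system such as Linux
#endif

// Returns the number of logical processors, or -1 if the query fails.
int logical_cpu_count() {
#if defined(_WIN32)
    SYSTEM_INFO info;
    GetSystemInfo(&info);
    return static_cast<int>(info.dwNumberOfProcessors);
#elif defined(__APPLE__)
    int n = 0;
    size_t len = sizeof(n);
    if (sysctlbyname("hw.logicalcpu", &n, &len, nullptr, 0) != 0) {
        return -1;
    }
    return n;
#else
    const long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? static_cast<int>(n) : -1;
#endif
}
```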
Ok, all that's assuming C++. For ASM, I presume you'll only be running on x86 or amd64 CPUs? You'll still need two branch paths, one for each architecture, and you'll need to test Intel separately from AMD (IIRC), but by and large you just check the CPUID. Is that what you're trying to find? The CPUID from ASM on Intel/AMD family CPUs?
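For what it's worth, CPUID can also be read from C++ without hand-written assembly. The sketch below assumes GCC or Clang on x86/amd64 and uses `__get_cpuid` from `<cpuid.h>`; the function names are made up for this example. It reads the HTT feature flag (leaf 1, EDX bit 28). That flag only says the package can expose more than one logical processor, and many multi-core CPUs set it whether or not Hyper-Threading is on, so at best it can rule HT out.

```cpp
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define SKETCH_HAVE_CPUID 1
#endif

// True if bit 28 (HTT) is set in the EDX value returned by CPUID leaf 1.
bool htt_bit_set(unsigned int edx) {
    return (edx & (1u << 28)) != 0;
}

// Queries CPUID leaf 1 and reports the HTT flag.
// Returns false when CPUID is unavailable in this build (non-x86 or other compiler).
bool cpu_reports_htt_flag() {
#ifdef SKETCH_HAVE_CPUID
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx) == 0) {
        return false;              // leaf 1 not supported
    }
    return htt_bit_set(edx);
#else
    return false;
#endif
}
```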
On OS X, you can read these values from sysctl(3) (the C API, or the command line utility of the same name). The man page should give you usage information. The following keys may be of interest:
OpenMP should do the trick:
Most compilers support OpenMP. If you are using a GCC-based compiler (*nix, macOS), you need to compile using:
(you might also need to tell your compiler to use the stdc++ library):
As far as I know, OpenMP was designed to solve this kind of problem.
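A minimal sketch of the OpenMP route, assuming the processor count is what is wanted from it. `omp_get_num_procs()` is a standard OpenMP runtime call; the wrapper name `openmp_processor_count` and the fallback to `std::thread::hardware_concurrency()` are additions for this example. Note that the value is the number of processors available to the program, which normally means logical processors, so on its own it does not tell physical cores apart from hyper threads. With GCC, OpenMP is normally switched on with the `-fopenmp` flag.

```cpp
#include <thread>
#ifdef _OPENMP
#include <omp.h>
#endif

// Number of processors OpenMP reports as available to the program.
// Falls back to the standard library when OpenMP is not enabled at compile time.
int openmp_processor_count() {
#ifdef _OPENMP
    return omp_get_num_procs();
#else
    const unsigned n = std::thread::hardware_concurrency();
    return n > 0 ? static_cast<int>(n) : 1;  // 0 means "unknown"; assume one processor
#endif
}
```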
Windows-only solution described here: GetLogicalProcessorInformation
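A rough sketch of how that call can be used to count physical cores: ask for the required buffer size, fetch the records, and count those whose relationship is `RelationProcessorCore`. The function name `windows_physical_core_count` is made up for this example, and the non-Windows branch exists only so the snippet compiles elsewhere. On machines with more than 64 logical processors this API only covers the processor group of the calling thread.

```cpp
#ifdef _WIN32
#include <windows.h>
#include <cstddef>
#include <vector>
#endif

// Counts physical cores via GetLogicalProcessorInformation.
// Returns -1 on failure, or when not built for Windows.
int windows_physical_core_count() {
#ifdef _WIN32
    DWORD bytes = 0;
    // First call is expected to fail and report the required buffer size.
    GetLogicalProcessorInformation(nullptr, &bytes);
    if (GetLastError() != ERROR_INSUFFICIENT_BUFFER) {
        return -1;
    }
    std::vector<SYSTEM_LOGICAL_PROCESSOR_INFORMATION> info(
        bytes / sizeof(SYSTEM_LOGICAL_PROCESSOR_INFORMATION));
    if (!GetLogicalProcessorInformation(info.data(), &bytes)) {
        return -1;
    }
    // Only look at the records actually returned by the second call.
    const std::size_t returned = bytes / sizeof(SYSTEM_LOGICAL_PROCESSOR_INFORMATION);
    int cores = 0;
    for (std::size_t i = 0; i < returned && i < info.size(); ++i) {
        if (info[i].Relationship == RelationProcessorCore) {
            ++cores;               // one record per physical core
        }
    }
    return cores;
#else
    return -1;
#endif
}
```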
For Linux, use the /proc/cpuinfo file. I am not running Linux now, so I can't give you more detail. You can count physical/logical processor instances. If the logical count is twice the physical count, then you have HT enabled (true only for x86).
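The counting described above might look something like this in C++. It is a sketch under the assumption that the file has the usual x86 layout: one blank-line-separated block per logical processor, each with `processor`, `physical id` and `core id` fields. The names `CpuCounts`, `count_cpus` and `count_cpus_from_proc` are made up for this example. If the `core id` field is missing (as on some non-x86 systems), the physical count comes back as 0.

```cpp
#include <cstdlib>
#include <fstream>
#include <istream>
#include <set>
#include <sstream>
#include <string>
#include <utility>

struct CpuCounts {
    int logical = 0;   // number of "processor" entries
    int physical = 0;  // number of distinct (physical id, core id) pairs
};

// Parses text in /proc/cpuinfo format (x86 layout assumed).
CpuCounts count_cpus(std::istream& in) {
    CpuCounts counts;
    std::set<std::pair<int, int>> cores;
    int physical_id = 0;
    int core_id = -1;
    auto end_block = [&] {
        if (core_id >= 0) cores.insert({physical_id, core_id});
        physical_id = 0;
        core_id = -1;
    };
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) { end_block(); continue; }   // blank line ends a processor block
        const auto colon = line.find(':');
        if (colon == std::string::npos) continue;      // not a "key : value" line
        std::string key = line.substr(0, colon);
        while (!key.empty() && (key.back() == ' ' || key.back() == '\t')) key.pop_back();
        const std::string value = line.substr(colon + 1);
        if (key == "processor") ++counts.logical;
        else if (key == "physical id") physical_id = std::atoi(value.c_str());
        else if (key == "core id") core_id = std::atoi(value.c_str());
    }
    end_block();                                       // last block may lack a trailing blank line
    counts.physical = static_cast<int>(cores.size());
    return counts;
}

// Convenience wrapper reading the real file; yields zero counts if it cannot be opened.
CpuCounts count_cpus_from_proc() {
    std::ifstream file("/proc/cpuinfo");
    return count_cpus(file);
}
```

With these counts, HT would be considered enabled when `physical > 0` and `logical == 2 * physical`.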