Question:
I want to know the number of CPUs on the local machine using Python. The result should be user/real as output by time(1) when called with an optimally scaling userspace-only program.
Answer 1:
If you have Python 2.6 or later, you can simply use:
import multiprocessing
multiprocessing.cpu_count()
http://docs.python.org/library/multiprocessing.html#multiprocessing.cpu_count
Answer 2:
If you're interested in the number of processors available to your current process, you have to check cpuset first. Otherwise (or if cpuset is not in use), multiprocessing.cpu_count() is the way to go in Python 2.6 and newer. The following function falls back to a couple of alternative methods in older versions of Python:
import os
import re
import subprocess
def available_cpu_count():
    """ Number of available virtual or physical CPUs on this system, i.e.
    user/real as output by time(1) when called with an optimally scaling
    userspace-only program"""
    # cpuset
    # cpuset may restrict the number of *available* processors
    try:
        m = re.search(r'(?m)^Cpus_allowed:\s*(.*)$',
                      open('/proc/self/status').read())
        if m:
            res = bin(int(m.group(1).replace(',', ''), 16)).count('1')
            if res > 0:
                return res
    except IOError:
        pass
    # Python 2.6+
    try:
        import multiprocessing
        return multiprocessing.cpu_count()
    except (ImportError, NotImplementedError):
        pass
    # https://github.com/giampaolo/psutil
    try:
        import psutil
        return psutil.cpu_count()  # psutil.NUM_CPUS on old versions
    except (ImportError, AttributeError):
        pass
    # POSIX
    try:
        res = int(os.sysconf('SC_NPROCESSORS_ONLN'))
        if res > 0:
            return res
    except (AttributeError, ValueError):
        pass
    # Windows
    try:
        res = int(os.environ['NUMBER_OF_PROCESSORS'])
        if res > 0:
            return res
    except (KeyError, ValueError):
        pass
    # jython
    try:
        from java.lang import Runtime
        runtime = Runtime.getRuntime()
        res = runtime.availableProcessors()
        if res > 0:
            return res
    except ImportError:
        pass
    # BSD
    try:
        sysctl = subprocess.Popen(['sysctl', '-n', 'hw.ncpu'],
                                  stdout=subprocess.PIPE)
        scStdout = sysctl.communicate()[0]
        res = int(scStdout)
        if res > 0:
            return res
    except (OSError, ValueError):
        pass
    # Linux
    try:
        res = open('/proc/cpuinfo').read().count('processor\t:')
        if res > 0:
            return res
    except IOError:
        pass
    # Solaris
    try:
        pseudoDevices = os.listdir('/devices/pseudo/')
        res = 0
        for pd in pseudoDevices:
            if re.match(r'^cpuid@[0-9]+$', pd):
                res += 1
        if res > 0:
            return res
    except OSError:
        pass
    # Other UNIXes (heuristic)
    try:
        try:
            dmesg = open('/var/run/dmesg.boot').read()
        except IOError:
            dmesgProcess = subprocess.Popen(['dmesg'], stdout=subprocess.PIPE)
            dmesg = dmesgProcess.communicate()[0]
        res = 0
        while '\ncpu' + str(res) + ':' in dmesg:
            res += 1
        if res > 0:
            return res
    except OSError:
        pass
    raise Exception('Can not determine number of CPUs on this system')
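The cpuset check at the top of that function is probably its least obvious step. Cpus_allowed in /proc/self/status is a hexadecimal bitmask, printed as comma-separated groups, in which each set bit marks one CPU the process may run on. Here is a minimal sketch of just that parsing step, run against a sample string rather than the live file. The helper name count_allowed_cpus and the sample text are made up for illustration:

```python
import re


def count_allowed_cpus(status_text):
    """Count the set bits in the Cpus_allowed mask of a /proc/<pid>/status dump.

    Returns None if the text contains no Cpus_allowed line.
    """
    m = re.search(r'(?m)^Cpus_allowed:\s*(.*)$', status_text)
    if not m:
        return None
    # The mask is printed as comma-separated hex groups: drop the commas,
    # parse the result as one hex number, and count its 1 bits.
    mask = int(m.group(1).replace(',', ''), 16)
    return bin(mask).count('1')


# Hypothetical excerpt of a status file: mask 0xf means CPUs 0-3 are allowed.
sample = "Name:\tpython3\nCpus_allowed:\tf\nCpus_allowed_list:\t0-3\n"
print(count_allowed_cpus(sample))
```

The Cpus_allowed_list line is not matched because the pattern requires the colon directly after "Cpus_allowed".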
Answer 3:
Another option is to use the psutil library, which always turns out to be useful in these situations:
>>> import psutil
>>> psutil.cpu_count()
2
This should work on any platform supported by psutil (Unix and Windows).
Note that on some occasions multiprocessing.cpu_count may raise a NotImplementedError while psutil will be able to obtain the number of CPUs. This is simply because psutil first tries to use the same techniques used by multiprocessing and, if those fail, it also uses other techniques.
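If you would rather not depend on psutil unconditionally, the same idea can be applied by hand: try multiprocessing first and only reach for psutil when that fails. This is a rough sketch, not a tested recipe for every platform. The function name cpu_count_with_fallback is my own, and it returns None when neither source gives an answer:

```python
import multiprocessing


def cpu_count_with_fallback():
    """Try multiprocessing first, then psutil if it happens to be installed."""
    try:
        return multiprocessing.cpu_count()
    except NotImplementedError:
        pass
    try:
        import psutil  # third-party package; may not be installed
    except ImportError:
        return None
    # psutil.cpu_count() can itself return None if the count is undetermined.
    return psutil.cpu_count()


print(cpu_count_with_fallback())
```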
Answer 4:
In Python 3.4+: os.cpu_count().
multiprocessing.cpu_count() is implemented in terms of this function but raises NotImplementedError if os.cpu_count() returns None ("can't determine number of CPUs").
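Because os.cpu_count() signals failure with None rather than an exception, callers need to handle that case before using the value as a worker count. A small sketch, where the helper name cpu_count_or_default and the default of 1 are arbitrary choices of mine:

```python
import os


def cpu_count_or_default(default=1):
    """Return os.cpu_count(), or `default` when the count is undetermined."""
    n = os.cpu_count()  # None if the number of CPUs cannot be determined
    return default if n is None else n


print(cpu_count_or_default())
```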
Answer 5:
Platform independent (logical=False counts physical cores only):
psutil.cpu_count(logical=False)
https://github.com/giampaolo/psutil/blob/master/INSTALL.rst
Answer 6:
len(os.sched_getaffinity(0)) is what you usually want.
https://docs.python.org/3/library/os.html#os.sched_getaffinity
os.sched_getaffinity(0) (added in Python 3.3) returns the set of CPUs available to the process, taking into account the sched_setaffinity Linux system call, which limits which CPUs a process and its children can run on. The argument 0 means to get the value for the current process. The function returns a set() of allowed CPUs, hence the need for len().
multiprocessing.cpu_count(), on the other hand, just returns the total number of logical CPUs in the system, regardless of affinity.
The difference is especially important because certain cluster management systems such as Platform LSF limit job CPU usage with sched_setaffinity. Therefore, if you use multiprocessing.cpu_count(), your script might try to use far more cores than it has available, which may lead to overload and timeouts.
We can see the difference concretely by restricting the affinity with the taskset utility. For example, if I restrict Python to just 1 core (core 0) on my 16-core system:
taskset -c 0 ./main.py
with the test script:
main.py
#!/usr/bin/env python3
import multiprocessing
import os
print(multiprocessing.cpu_count())
print(len(os.sched_getaffinity(0)))
then the output is:
16
1
nproc, however, does respect the affinity by default, and:
taskset -c 0 nproc
outputs:
1
and man nproc makes that quite explicit:
print the number of processing units available
nproc has the --all flag for the less common case where you want the total number of installed CPUs regardless of affinity:
taskset -c 0 nproc --all
The only downside of this method is that it appears to be UNIX only. I suppose Windows must have a similar affinity API, possibly SetProcessAffinityMask, so I wonder why it hasn't been ported. But I know nothing about Windows.
Tested in Ubuntu 16.04, Python 3.5.2.
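The same effect can be reproduced from inside Python, without taskset, by calling os.sched_setaffinity directly. The following is a sketch under the assumption that the platform exposes the affinity API and allows the process to change its own mask. The function name affinity_demo is made up, and it returns None where the API is missing or the call is not permitted:

```python
import os


def affinity_demo():
    """Pin this process to one CPU, report the counts, then restore the mask.

    Returns (total, before, pinned), or None if affinity cannot be changed.
    """
    if not hasattr(os, "sched_getaffinity"):
        return None  # platform without the affinity API
    original = os.sched_getaffinity(0)
    before = len(original)
    try:
        # Restrict the current process to a single CPU from its allowed set.
        os.sched_setaffinity(0, {min(original)})
    except OSError:
        return None  # changing affinity is not permitted here
    try:
        pinned = len(os.sched_getaffinity(0))
    finally:
        # Always restore the original mask.
        os.sched_setaffinity(0, original)
    return os.cpu_count(), before, pinned


print(affinity_demo())
```

The first element of the tuple is unaffected by pinning, while the last one reflects the restricted mask.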
Answer 7:
multiprocessing.cpu_count() will return the number of logical CPUs, so if you have a quad-core CPU with hyperthreading, it will return 8. If you want the number of physical CPUs, use the Python bindings to hwloc:
#!/usr/bin/env python
import hwloc
topology = hwloc.Topology()
topology.load()
print(topology.get_nbobjs_by_type(hwloc.OBJ_CORE))
hwloc is designed to be portable across OSes and architectures.
Answer 8:
These give you the hyperthreaded CPU count:
multiprocessing.cpu_count()
os.cpu_count()
These give you the virtual machine CPU count:
psutil.cpu_count()
numexpr.detect_number_of_cores()
This only matters if you work on VMs.
Answer 9:
I can't figure out how to add to the code or reply to the message, but here's support for Jython that you can tack on before you give up:
    # jython
    try:
        from java.lang import Runtime
        runtime = Runtime.getRuntime()
        res = runtime.availableProcessors()
        if res > 0:
            return res
    except ImportError:
        pass
Answer 10:
You can also use joblib for this purpose:
import joblib
print(joblib.cpu_count())
This will give you the number of CPUs in the system. joblib needs to be installed, though. More information on joblib can be found here: https://pythonhosted.org/joblib/parallel.html
Alternatively, you can use the numexpr package. It has a number of simple functions that are helpful for getting information about the system's CPUs.
import numexpr as ne
print(ne.detect_number_of_cores())
Answer 11:
Another option if you don't have Python 2.6 (Linux only; the commands module exists only in Python 2):
import commands
n = int(commands.getoutput("grep -c ^processor /proc/cpuinfo"))
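For anyone on Python 3, where the commands module no longer exists, the same count can be obtained by reading the file directly instead of shelling out to grep. This is only a sketch: the function names and the sample text are invented for illustration, and it assumes the common Linux layout in which each logical CPU has a line beginning with "processor", which may not hold on every architecture:

```python
import re


def count_processor_entries(cpuinfo_text):
    """Count lines that start with 'processor' followed by a colon."""
    return len(re.findall(r'(?m)^processor\s*:', cpuinfo_text))


def read_cpuinfo_count(path="/proc/cpuinfo"):
    """Linux only; returns None if the file cannot be read."""
    try:
        with open(path) as f:
            return count_processor_entries(f.read())
    except OSError:
        return None


# Hypothetical two-CPU excerpt in the usual /proc/cpuinfo layout.
sample = (
    "processor\t: 0\nmodel name\t: Example CPU\n\n"
    "processor\t: 1\nmodel name\t: Example CPU\n"
)
print(count_processor_entries(sample))
```

Anchoring the match at the start of the line avoids counting other fields that merely contain the word "processor".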
Answer 12:
This may work for those of us who use different operating systems but want the best of all worlds:
import os
workers = os.cpu_count()
if hasattr(os, 'sched_getaffinity'):
    workers = len(os.sched_getaffinity(0))
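To show how such a count is typically consumed, here is a sketch that wraps the check in a function and uses the result to size a worker pool. The names usable_cpu_count and square are hypothetical. A thread pool is used only so that the snippet runs as-is without a main guard; for CPU-bound work a process pool would be the more usual choice:

```python
import os
from concurrent.futures import ThreadPoolExecutor


def usable_cpu_count():
    """CPUs this process may run on, falling back to the machine-wide count."""
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1  # os.cpu_count() can return None


def square(x):
    return x * x


workers = usable_cpu_count()
with ThreadPoolExecutor(max_workers=workers) as pool:
    # map() preserves input order, so results line up with range(8).
    results = list(pool.map(square, range(8)))
print(workers, results)
```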