Consider:
>>> timeit.timeit('from win32com.client import Dispatch', number=100000)
0.18883283882571789
>>> timeit.timeit('import win32com.client', number=100000)
0.1275979248277963
It takes significantly longer to import only the Dispatch function than to import the entire module, which seems counterintuitive. Could someone explain why the overhead of importing a single function is so high? Thanks!
That's because:
from win32com.client import Dispatch
is equivalent to:
import win32com.client #import the whole module first
Dispatch = win32com.client.Dispatch #assign the required attributes to global variables
del win32com #remove the reference to module object
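The equivalence can be checked without pywin32 installed. The following is a minimal sketch for Python 3 that substitutes os.path for win32com.client (an assumption made only so the snippet runs anywhere) and compares the names each form leaves behind in a fresh namespace. The helper names_bound is made up for this example.

```python
# Compare the namespace left by a from-import with its three-step expansion.
# os.path stands in for win32com.client here (an assumption for portability).

def names_bound(source):
    """Run source in a fresh namespace and return the names it binds."""
    namespace = {}
    exec(source, namespace)
    namespace.pop("__builtins__", None)  # added by exec, not by the import
    return namespace

from_form = names_bound("from os.path import splitext")
expanded = names_bound(
    "import os.path\n"
    "splitext = os.path.splitext\n"
    "del os\n"
)

print(sorted(from_form))   # names bound by the from-import
print(sorted(expanded))    # names left after the three-step version
print(from_form["splitext"] is expanded["splitext"])
```

Both namespaces end up holding only splitext, and both names refer to the same function object.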
But from win32com.client import Dispatch has its own advantages. For example, if you use win32com.client.Dispatch multiple times in your code, it is better to bind it to a single name so that the number of lookups is reduced. Otherwise, each call to win32com.client.Dispatch() will first look up win32com, then client inside win32com, and finally Dispatch inside win32com.client.
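The lookup savings can be measured directly. This is a rough sketch that again uses os.path.splitext in place of win32com.client.Dispatch (an assumption, so that it runs without pywin32). The function names are made up for illustration, and absolute timings will vary by machine and Python version.

```python
import os.path
import timeit
from os.path import splitext

def via_dotted_name(filename):
    # Looks up os, then path on os, then splitext on os.path, on every call.
    return os.path.splitext(filename)

def via_bound_name(filename):
    # Looks up a single name, splitext, on every call.
    return splitext(filename)

# Both spellings reach the same function and give the same result.
same_result = via_dotted_name("report.txt") == via_bound_name("report.txt")

dotted = timeit.timeit(lambda: via_dotted_name("report.txt"), number=100000)
bound = timeit.timeit(lambda: via_bound_name("report.txt"), number=100000)
print("dotted: %.4f s, bound: %.4f s" % (dotted, bound))
```

The gap is likely to be modest, because the call itself costs more than the extra attribute lookups, so this mostly matters in tight loops.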
Byte-code comparison:
From the byte code it is clear that from os.path import splitext requires more steps than a plain import os.path.
>>> def func1():
...     from os.path import splitext
...
>>> def func2():
...     import os.path
...
>>> import dis
>>> dis.dis(func1)
  2           0 LOAD_CONST               1 (-1)
              3 LOAD_CONST               2 (('splitext',))
              6 IMPORT_NAME              0 (os.path)
              9 IMPORT_FROM              1 (splitext)
             12 STORE_FAST               0 (splitext)
             15 POP_TOP
             16 LOAD_CONST               0 (None)
             19 RETURN_VALUE
>>> dis.dis(func2)
  2           0 LOAD_CONST               1 (-1)
              3 LOAD_CONST               0 (None)
              6 IMPORT_NAME              0 (os.path)
              9 STORE_FAST               0 (os)
             12 LOAD_CONST               0 (None)
             15 RETURN_VALUE
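The disassembly above comes from Python 2. A sketch like the following can repeat the comparison on Python 3, where opcode names and offsets differ between versions, so it only collects the opcode names instead of hard-coding a listing. The helper name opnames is made up for this example.

```python
import dis

def opnames(source):
    """Return the opcode names of a compiled module-level statement."""
    code = compile(source, "<demo>", "exec")
    return [instruction.opname for instruction in dis.get_instructions(code)]

from_ops = opnames("from os.path import splitext")
plain_ops = opnames("import os.path")

print(from_ops)
print(plain_ops)
# The from-import carries an extra IMPORT_FROM, plus a POP_TOP that
# discards the module object once the name has been stored.
print("IMPORT_FROM" in from_ops, "IMPORT_FROM" in plain_ops)
```

Because the statements are compiled at module level here, the names are stored with STORE_NAME rather than the STORE_FAST seen inside the functions above.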
Module caching:
Note that after from os.path import splitext you can still access the os module through sys.modules, because Python caches imported modules.
From docs:
Note For efficiency reasons, each module is only imported once per
interpreter session. Therefore, if you change your modules, you must
restart the interpreter – or, if it’s just one module you want to test
interactively, use reload(), e.g. reload(modulename).
Demo:
import sys
from os.path import splitext
try:
    print os
except NameError:
    print "os not found"
try:
    print os.path
except NameError:
    print "os.path is not found"
print sys.modules['os']
output:
os not found
os.path is not found
<module 'os' from '/usr/lib/python2.7/os.pyc'>
Timing comparisons:
$ python -m timeit -n 1 'from os.path import splitext'
1 loops, best of 3: 5.01 usec per loop
$ python -m timeit -n 1 'import os.path'
1 loops, best of 3: 4.05 usec per loop
$ python -m timeit -n 1 'from os import path'
1 loops, best of 3: 5.01 usec per loop
$ python -m timeit -n 1 'import os'
1 loops, best of 3: 2.86 usec per loop
The entire module still has to be imported to get the name you want from it. You'll also find that the OS caches the module file, so subsequent access to the .pyc file will be quicker.
The main issue here is that your code isn't timing what you think it is timing. timeit.timeit() will run the import statement in a loop, 100000 times, but at most the first iteration will actually perform the import. All other iterations simply look up the module in sys.modules, look up the name Dispatch in the module's globals, and add this name to the importing module's globals. So it's essentially only dictionary operations, and small variations in the byte code become visible because their relative influence is large compared to the very cheap dictionary operations.
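One way to see that only the first import does real work is to count how often the import machinery is asked to locate the module. The sketch below, for Python 3, installs a hypothetical CountingFinder on sys.meta_path that records each request and never handles it itself. colorsys is used only because it is a small standard-library module (an assumption of this example), and it is evicted from sys.modules first so that the first import is a real one.

```python
import sys

class CountingFinder:
    """Meta path finder that records lookups and defers to the real finders."""

    def __init__(self):
        self.requests = []

    def find_spec(self, fullname, path=None, target=None):
        self.requests.append(fullname)
        return None  # decline, so the normal finders handle the import

finder = CountingFinder()
sys.modules.pop("colorsys", None)  # make sure the first import is not cached
sys.meta_path.insert(0, finder)
try:
    for _ in range(1000):
        # Only the first pass reaches the finders; later passes are
        # served from sys.modules.
        from colorsys import rgb_to_hsv
finally:
    sys.meta_path.remove(finder)

lookups = finder.requests.count("colorsys")
print(lookups)
```

The finder is consulted once for colorsys across all 1000 iterations, which is why a timing loop over a cached import mostly measures dictionary operations.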
If, on the other hand, you measure the time it takes to actually import the module, you can't see any difference between the two approaches, since in both cases this time is completely dominated by the actual import, and the differences from fiddling around with the name dictionary become negligible. We can force reimports by deleting the module from sys.modules in each iteration:
In [1]: import sys
In [2]: %timeit from os import path; del sys.modules["os"]
1000 loops, best of 3: 248 us per loop
In [3]: %timeit import os.path; del sys.modules["os"]
1000 loops, best of 3: 248 us per loop
In [4]: %timeit from os import path
1000000 loops, best of 3: 706 ns per loop
In [5]: %timeit import os.path
1000000 loops, best of 3: 444 ns per loop
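Outside IPython, the same experiment can be approximated with the timeit module. This is a sketch rather than a benchmark: it uses colorsys as a small stand-in module (an assumption of this example, not the module timed above), and the numbers will vary by machine and Python version.

```python
import timeit

N = 200  # iterations per measurement; kept small because real imports are slow

# Cached: the module is already in sys.modules, so each pass is a lookup.
cached_plain = timeit.timeit(
    "import colorsys", setup="import colorsys", number=N)
cached_from = timeit.timeit(
    "from colorsys import rgb_to_hsv", setup="import colorsys", number=N)

# Forced: evict the module after each pass so every iteration re-imports it.
forced_plain = timeit.timeit(
    "import colorsys; del sys.modules['colorsys']",
    setup="import sys", number=N)
forced_from = timeit.timeit(
    "from colorsys import rgb_to_hsv; del sys.modules['colorsys']",
    setup="import sys", number=N)

results = [
    ("cached import", cached_plain),
    ("cached from-import", cached_from),
    ("forced import", forced_plain),
    ("forced from-import", forced_from),
]
for label, seconds in results:
    print("%-20s %10.2f us per loop" % (label, seconds / N * 1e6))
```

The forced rows should come out far slower than the cached ones, with the two forced rows close to each other, matching the pattern in the IPython session above.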