Improving the speed of Python module imports

Posted 2019-01-31 03:23

The question of how to speed up importing of Python modules has been asked previously (Speeding up the python "import" loader and Python -- Speed Up Imports?) but without specific examples and has not yielded accepted solutions. I will therefore take up the issue again here, but this time with a specific example.

I have a Python script that loads a 3-D image stack from disk, smooths it, and displays it as a movie. I call this script from the system command prompt when I want to quickly view my data. I'm OK with the 700 ms it takes to smooth the data as this is comparable to MATLAB. However, it takes an additional 650 ms to import the modules. So from the user's perspective the Python code runs at half the speed.

This is the series of modules I'm importing:

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import scipy.ndimage
import scipy.signal
import sys
import os

Of course, not all modules are equally slow to import. The chief culprits are:

matplotlib.pyplot   [300ms]
numpy               [110ms]
scipy.signal        [200ms]
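
Per-module figures like these can be reproduced with a small timing helper. What follows is a minimal sketch, not necessarily how the numbers above were obtained; time_import is a hypothetical helper name, and the standard-library modules in the loop are stand-ins for numpy, matplotlib.pyplot, and scipy.signal. A module that is already in sys.modules imports almost instantly, so each measurement should come from a fresh interpreter.

```python
import importlib
import time


def time_import(module_name):
    """Return the seconds taken to import module_name (hypothetical helper).

    Only the first import in a process is meaningful: later imports are
    served from the sys.modules cache and take almost no time.
    """
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Stand-ins; replace with numpy, matplotlib.pyplot, scipy.signal, ...
    for name in ["json", "decimal"]:
        print("%-20s [%.0f ms]" % (name, time_import(name) * 1000))
```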

I have experimented with using from, but this isn't any faster. Since Matplotlib is the main culprit and it's got a reputation for slow screen updates, I looked for alternatives. One is PyQtGraph, but that takes 550 ms to import.

I am aware of one obvious solution, which is to call my function from an interactive Python session rather than the system command prompt. This is fine, but it's too MATLAB-like; I'd prefer the elegance of having my function available from the system prompt.

I'm new to Python and I'm not sure how to proceed at this point. Since I'm new, I'd appreciate links on how to implement proposed solutions. Ideally, I'm looking for a simple solution (aren't we all!) because the code needs to be portable between multiple Mac and Linux machines.

4 Answers
姐就是有狂的资本
#2 · 2019-01-31 04:13

Not an actual answer to the question, but a hint on how to profile the import speed with Python 3.7 and tuna (a small project of mine):

python3.7 -X importtime -c "import foobar" 2> foobar.log
tuna foobar.log
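
If tuna is not at hand, the same log can be summarised with a few lines of Python. This is only a sketch: import_times is a hypothetical helper, and it assumes the log lines have the form "import time: <self> | <cumulative> | <name>" in microseconds, which is what CPython 3.7 and later write to stderr under -X importtime.

```python
import subprocess
import sys


def import_times(statement):
    """Run `statement` in a fresh interpreter with -X importtime and return
    (cumulative_microseconds, module_name) pairs, slowest first."""
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", statement],
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    rows = []
    for line in result.stderr.splitlines():
        if not line.startswith("import time:"):
            continue
        fields = line[len("import time:"):].split("|")
        if len(fields) != 3:
            continue
        try:
            cumulative = int(fields[1])
        except ValueError:
            continue  # the header line has column titles, not numbers
        rows.append((cumulative, fields[2].strip()))
    return sorted(rows, reverse=True)


if __name__ == "__main__":
    # "import json" is a stand-in for the slow import being investigated.
    for cumulative, name in import_times("import json")[:10]:
        print("%8d us  %s" % (cumulative, name))
```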

Explosion°爆炸
#3 · 2019-01-31 04:18

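You can import your modules manually instead, using imp (see the imp module documentation). Note that imp has been deprecated since Python 3.4 and was removed in Python 3.12, so this applies to older interpreters such as the Python 2.7 used in the example.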

For example, import numpy as np could probably be written as

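import imp
# imp.PKG_DIRECTORY (the value 5) tells load_module that the path is a package directory
np = imp.load_module("numpy", None, "/usr/lib/python2.7/dist-packages/numpy", ('', '', imp.PKG_DIRECTORY))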

This spares Python from searching your entire sys.path to find the desired packages.
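
On current Python versions the same idea can be expressed with importlib. The sketch below loads a single-file module from an explicit path; load_from_path and the throwaway demo_mod module are hypothetical names used only for illustration. A package such as numpy would additionally need its __init__.py path and the submodule_search_locations argument, and nothing here shows that skipping the sys.path search saves a meaningful share of the import time.

```python
import importlib.util
import os
import sys
import tempfile


def load_from_path(module_name, file_path):
    """Import a single-file module from an explicit path, bypassing the
    sys.path search (hypothetical helper)."""
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[module_name] = module  # register before executing, as import does
    spec.loader.exec_module(module)
    return module


# Demonstration with a throwaway module; its name and contents are made up.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_mod.py")
    with open(path, "w") as handle:
        handle.write("VALUE = 42\n")
    demo = load_from_path("demo_mod", path)

print(demo.VALUE)
```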

See also:

Manually importing gtk fails: module not found

爷、活的狠高调
#4 · 2019-01-31 04:19

1.35 seconds isn't long, but I suppose if you're used to half that for a "quick check" then perhaps it seems so.

Andrea suggests a simple client/server setup, but it seems to me that you could just as easily call a very slight modification of your script and keep its console window open while you work:

  • Call the script, which does the imports then waits for input
  • Minimize the console window, switch to your work, whatever: *Do work*
  • Select the console again
  • Provide the script with some sort of input
  • Receive the results with no import overhead
  • Switch away from the script again while it happily awaits input

I assume your script is identical every time, i.e. you don't need to give it the image stack location or any particular commands each time (but these are easy to add as well!).

Example RAAC's_Script.py:

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import scipy.ndimage
import scipy.signal
import sys
import os

print('********* RAAC\'s Script Now Running *********')

while True: # Loops forever
    # Display a message and wait for user to enter text followed by enter key.
    # In this case, we're not expecting any text at all and if there is any it's ignored
    input('Press Enter to test image stack...')

    '''
    *
    *
    **RAAC's Code Goes Here** (Make sure it's indented/inside the while loop!)
    *
    *
    '''

To end the script, close the console window or press Ctrl+C.

I've made this as simple as possible, but it would require very little extra to handle things like quitting nicely, doing slightly different things based on input, etc.
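
As a rough illustration of that "very little extra", the loop could take a file name on each pass and stop cleanly on q. This is a sketch under assumptions: run_loop, process, and read_line are hypothetical names, and process stands for whatever loads, smooths, and displays the stack. The heavy imports would still sit at the top of the script, outside the loop.

```python
def run_loop(process, read_line=input):
    """Call process(text) for each line entered until the user quits.

    Returns the number of times process was called. An empty line is
    passed through as "", so process can treat it as "repeat the last file".
    """
    handled = 0
    while True:
        try:
            text = read_line("Image stack to view (q to quit): ").strip()
        except (EOFError, KeyboardInterrupt):
            break  # Ctrl+D or Ctrl+C ends the loop without a traceback
        if text.lower() in ("q", "quit"):
            break
        process(text)
        handled += 1
    return handled


if __name__ == "__main__":
    # print is a placeholder for the real viewing function.
    run_loop(print)
```

Passing the reader in as an argument keeps the loop testable without a keyboard.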

我命由我不由天
#5 · 2019-01-31 04:25

You could build a simple server/client pair: the server runs continuously, making and updating the plot, while the client just tells it the next file to process.

I wrote a simple server/client example based on the basic example from the socket module docs: http://docs.python.org/2/library/socket.html#example

Here is server.py:

# expensive imports
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import scipy.ndimage
import scipy.signal
import sys
import os

# Echo server program
import socket

HOST = ''                 # Symbolic name meaning all available interfaces
PORT = 50007              # Arbitrary non-privileged port
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind((HOST, PORT))
s.listen(1)
while True:
    conn, addr = s.accept()
    print('Connected by', addr)
    data = conn.recv(1024)        # bytes: the filename sent by the client
    if not data: break
    conn.sendall(b"PLOTTING:" + data)
    # update plot
    conn.close()

and client.py:

# Echo client program
import socket
import sys

HOST = 'localhost'        # The server's host (the same machine here)
PORT = 50007              # The same port as used by the server
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((HOST, PORT))
s.sendall(sys.argv[1].encode())   # sockets carry bytes in Python 3
data = s.recv(1024)
s.close()
print('Received', repr(data))

You just run the server:

python server.py

This does the imports once. The client then just sends the name of the new file to plot over the socket:

python client.py mytextfile.txt

The server then updates the plot.

On my machine, running your imports takes 0.6 seconds, while running client.py takes 0.03 seconds.
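
The exchange can also be tried in a single process, without opening two terminals. The following Python 3 sketch runs one server pass in a background thread and sends it a file name. The function names serve_once, send_filename, and demo are hypothetical, the handle callback stands in for the "update plot" step, and the OS is asked for a free port instead of using 50007.

```python
import socket
import threading


def serve_once(server_sock, handle):
    """Accept one connection, pass the received filename to handle(), reply."""
    conn, _addr = server_sock.accept()
    with conn:
        filename = conn.recv(1024).decode("utf-8")
        handle(filename)  # stand-in for the "update plot" step
        conn.sendall(("PLOTTING:" + filename).encode("utf-8"))


def send_filename(host, port, filename):
    """Send a filename to the server and return its reply as text."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(filename.encode("utf-8"))
        return sock.recv(1024).decode("utf-8")


def demo():
    """Run one request/reply round trip on the loopback interface."""
    received = []
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server_sock.listen(1)
    port = server_sock.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(server_sock, received.append))
    worker.start()
    reply = send_filename("127.0.0.1", port, "mytextfile.txt")
    worker.join()
    server_sock.close()
    return reply, received


if __name__ == "__main__":
    print(demo())
```

In the real setup the expensive imports would live in the long-running server process, so only the tiny client pays interpreter start-up cost on each call.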
