How does Python read and interpret source files?

2019-02-28 07:59发布

问题:

Say I run a Python (2.7, though I'm not sure that makes a difference here) script. Instead of terminating the script, I tab out, or somehow switch back to my editing environment. I can then modify the script and save it, but this changes nothing in the still-running script.

Does Python load all source files into memory completely at launch? I am under the impression that this is how the Python interpreter works, but this contradicts my other views of the Python interpreter: I have heard that .pyc files serve as byte-code for Python's virtual machine, like .class files in Java. At the same time however, some (very few in my understanding) implementations of Python also use just-in-time compilation techniques.

So am I correct in thinking that if I make a change to a .py file while my script is running, I don't see that change until I re-run the script, because at launch all necessary .py files are compiled into .pyc files, and simply modifying the .py files does not remake the .pyc files?

If that is correct, then why don't huge programs, like the one I'm working on with ~6,550 kilobytes of source code distributed over 20+ .py files, take forever to compile at startup? How is the program itself so fast?


Additional Info:

  • I am not using third-party modules. All of the files have been written locally. The main source file is relatively small (10 kB), but the source file I primarily work on is 65 kB. It was also written locally and changes every time before launch.

回答1:

Python loads the main script into memory, compiles it into bytecode and runs that. If you modify the source file in the meantime, you're not affecting the bytecode.

If you're running the script as the main script (i. e. by calling it like python myfile.py, then the bytecode will be discarded when the script exits.

If you're importing the script, however, then the bytecode will be written to disk as a .pyc file which won't be recompiled when imported again, unless you modify the corresponding .py file.

Your big 6.5 MB program consists of many modules which are imported by the (probably small) main script, so only that will have to be compiled at each run. All the other files will have their .pyc file ready to run.



回答2:

First of all, you are indeed correct in your understanding that changes to a Python source file aren't seen by the interpreter until the next run. There are some debugging systems, usually built for proprietary purposes, that allow you to reload modules, but this bring attendant complexities such as existing objects retaining references to code from the old module, for example. It can get really ugly, though.

The reason huge programs start up so quickly is the the interpreter tries to create a .pyc file for every .py file it imports if either no corresponding .pyc file exists or if the .py is newer. The .pyc is indeed the program compiled into byte code, so it's relatively quick to load.

As far as JIT compilation goes you may be thinking of the PyPy implementation, which is written in Python and has backends in several different languages. It's increasingly being used in Python 2 shops where execution speed is important, but it's along way from the CPython that we all know and love.