Is there a way to make Python ignore any .pyc files that are present and always interpret all the code (including imported modules) directly? Google hasn't turned up any answers, so I suspect not, but it seemed worth asking just in case.
(Why do I want to do this? I have a large pipeline of Python scripts which are run repeatedly over a cluster of a couple hundred computers. The Python scripts themselves live on a shared NFS filesystem. Somehow, rarely, after having been run hundreds of times over several hours, they will suddenly start crashing with an error about not being able to import a module. Forcing the regeneration of the .pyc file fixes the problem. I want, of course, to fix the underlying causes, but in the meantime we also need the system to continue running, so it seems like ignoring the .pyc files if possible would be a reasonable workaround).
P.S. I'm using Python 2.5, so I can't use -B.
Well, I don't think Python ever interprets code directly if you're loading the code from a file. Even when using the interactive shell, Python will compile the imported module into a .pyc.
That said, you could write a shell script to go ahead and delete all the .pyc files before launching your scripts. That would certainly force a full rebuild before every execution.
You could use the standard Python library's imp module to reimplement
__builtins__.__import__
, which is the hook function called byimport
andfrom
statement. In particular, the imp.load_module function can be used to load a.py
even when the corresponding.pyc
is present. Be sure to study carefully all the docs in the page I've pointed to, plus those for import, as it's kind of a delicate job. The docs themselves suggest using import hooks instead (per PEP 302) but for this particular task I suspect that would be even harder.BTW, likely causes for your observed problems include race conditions between different computers trying to write
.pyc
files at the same time -- NFS locking is notoriously flaky and has always been;-). As long as every Python compiler you're using is at the same version (if not, you're in big trouble anyway;-), I'd rather precompile all of those.py
files into.pyc
and make their directories read-only; the latter seems the simplest approach anyway (rather than hacking__import__
), even if for some reason you can't precompile.In case anyone is using python 2.6 or above with the same question, the simplest thing to do is:
-B
option, so they won't generate .pyc files.From the docs:
If you can't delete all the .pycs, then you could:
1) Run all your python interpreters with the
-B -O
options.This will tell python to look for .pyo files for bytecode instead of .pyc files (
-O
) and tell python not to generate any bytecode files (-B
).The combination of the two options, assuming you haven't used them before, is that Python won't generate any bytecode files and won't look for bytecode files that would have been generated by older runs.
From the docs:
You may find PEP 3147 - PYC Repository Directories to be of great interest from Python 3.2 onwards.
It's not exactly what you asked for, but would removing the existing .pyc files and then not creating any more work for you? In that case, you could use the -B option:
Perhaps you could work around this by, for example, scheduling a job to periodically shut down the scripts and delete the .pyc files.