I have a C++ code generator written in Python that generates many source files. Most of the time only one file changes, but because the generator regenerates all of the files, they are all rebuilt. Is there a way either to get Python not to overwrite unchanged files, or to get cmake to use a checksum rather than just the file date to decide what needs to be rebuilt?
I was thinking something like this would be easy in Python: If I could replace
with open('blah', 'w') as f:
with this:
with open_but_only_overwrite_if_total_output_is_different('blah', 'w') as f:
What's a nice way of accomplishing that?
Combining the code and ideas of Neil G, Petr Viktorin, gecco, and joel3000:
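import contextlib
import filecmp
import os
import shutil
import tempfile

@contextlib.contextmanager
def write_on_change(filename):
    # Text mode, to match open('blah', 'w'); the file is closed when this block exits.
    with tempfile.NamedTemporaryFile(mode='w', delete=False) as f:
        yield f
        tempname = f.name
    try:
        # shallow=False compares contents rather than trusting os.stat() signatures.
        overwrite = not filecmp.cmp(tempname, filename, shallow=False)
    except (OSError, IOError):
        # The target does not exist yet (or cannot be read).
        overwrite = True
    if overwrite:
        shutil.copyfile(tempname, filename)
    os.unlink(tempname)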
Some small additions (hopefully improvements):
shutil.copyfile copies only the contents of tempname into filename, so the existing file's metadata, such as its permissions and ownership, is left untouched.
filecmp.cmp checks the sizes of the files and returns False if they don't match. That could be a nice speedup if the files are large and one has stuff appended to the end. It also reads and compares bufsize = 8*1024 bytes at a time instead of a line at a time. bufsize will generally be bigger than a line, which results in fewer reads.
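To illustrate the filecmp.cmp behavior described above, here is a small sketch. The file names and the write_file helper are made up for the example; shallow=False is passed so that files of equal size are compared by content.

```python
import filecmp
import os
import shutil
import tempfile

def write_file(path, text):
    # Hypothetical helper, only used to set up the example files.
    with open(path, 'w') as f:
        f.write(text)

tmpdir = tempfile.mkdtemp()
old = os.path.join(tmpdir, 'old.cpp')
same = os.path.join(tmpdir, 'same.cpp')
longer = os.path.join(tmpdir, 'longer.cpp')

write_file(old, 'int main() { return 0; }\n')
write_file(same, 'int main() { return 0; }\n')
write_file(longer, 'int main() { return 0; }\n// appended\n')

# Equal sizes: the contents are read and compared in chunks.
same_result = filecmp.cmp(old, same, shallow=False)
# Different sizes: reported as different without reading the contents.
longer_result = filecmp.cmp(old, longer, shallow=False)

print(same_result, longer_result)  # True False
shutil.rmtree(tmpdir)
```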
I'd suggest you write your own file-like object like this:
__enter__: Create a temporary file.
__exit__: Compare the content of the temporary file with the old file (if it exists). If they are not the same, replace the old file with the temporary file.
This article is quite helpful for understanding the with statement: Understanding Python's "with" statement
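A minimal sketch of that __enter__/__exit__ idea might look like the following. The class name WriteOnChange and its _differs helper are hypothetical, and the sketch assumes text output and that the target should be left alone if the body raises.

```python
import filecmp
import os
import shutil
import tempfile

class WriteOnChange(object):
    """Hypothetical helper: replace `filename` only if the new content differs."""

    def __init__(self, filename):
        self.filename = filename
        self._tmp = None

    def __enter__(self):
        # Create a temporary file and hand it to the with-block for writing.
        self._tmp = tempfile.NamedTemporaryFile(mode='w', delete=False)
        return self._tmp

    def __exit__(self, exc_type, exc_value, traceback):
        self._tmp.close()
        try:
            # Only touch the target if the body succeeded and the content changed.
            if exc_type is None and self._differs():
                shutil.copyfile(self._tmp.name, self.filename)
        finally:
            os.unlink(self._tmp.name)
        return False  # never swallow exceptions from the with-block

    def _differs(self):
        if not os.path.exists(self.filename):
            return True
        return not filecmp.cmp(self._tmp.name, self.filename, shallow=False)

# Usage: the second write produces identical output, so the file is not rewritten.
demo_dir = tempfile.mkdtemp()
target = os.path.join(demo_dir, 'blah')
for _ in range(2):
    with WriteOnChange(target) as f:
        f.write('// generated\n')
shutil.rmtree(demo_dir)
```

Because the target is only written when the content differs, its modification time stays the same on unchanged runs, which is what keeps the build tool from rebuilding it.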
Use filecmp - http://docs.python.org/library/filecmp.html .
Write your new files into a tmp directory, compare them against your working directory, and transfer over the altered files. Then delete tmp.
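One way to sketch the tmp-directory approach is shown below. The function name sync_changed_files is hypothetical, and the sketch assumes a flat directory of generated files (no subdirectories).

```python
import filecmp
import os
import shutil
import tempfile

def sync_changed_files(tmp_dir, work_dir):
    """Copy files from tmp_dir into work_dir only if new or different.

    Returns the names of the files that were copied.
    """
    copied = []
    for name in sorted(os.listdir(tmp_dir)):
        src = os.path.join(tmp_dir, name)
        dst = os.path.join(work_dir, name)
        if not os.path.isfile(src):
            continue  # assumption: generated output is a flat set of files
        if os.path.exists(dst) and filecmp.cmp(src, dst, shallow=False):
            continue  # identical content: leave the old file and its mtime alone
        shutil.copyfile(src, dst)
        copied.append(name)
    return copied

# Demo: a.cpp is unchanged, b.cpp is new.
work_dir = tempfile.mkdtemp()
tmp_dir = tempfile.mkdtemp()
for d in (work_dir, tmp_dir):
    with open(os.path.join(d, 'a.cpp'), 'w') as f:
        f.write('int a;\n')
with open(os.path.join(tmp_dir, 'b.cpp'), 'w') as f:
    f.write('int b;\n')

changed = sync_changed_files(tmp_dir, work_dir)
shutil.rmtree(tmp_dir)  # "Then delete tmp."
print(changed)  # ['b.cpp']
```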
The easiest way is to do in Python exactly what cmake does: have the generator check if the input is newer than the output, and only generate if it is.
Here is a snippet I used for something similar:
if (os.path.exists(output) and
        os.path.getmtime(source) <= os.path.getmtime(output)):
    print("Generated output %s is up-to-date." % output)
    return
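If an output depends on more than one input (for example a template plus the generator script itself), the same timestamp check can be wrapped in a small function. This is only a sketch; the name is_up_to_date and the idea of passing a list of sources are assumptions, not part of the snippet above.

```python
import os

def is_up_to_date(sources, output):
    """Return True if output exists and is at least as new as every source."""
    if not os.path.exists(output):
        return False
    output_mtime = os.path.getmtime(output)
    # Any source newer than the output means it must be regenerated.
    return all(os.path.getmtime(src) <= output_mtime for src in sources)
```

The generator would call this before writing each file and skip the file when it returns True. Note that, like make-style timestamp checks in general, this depends on file dates rather than on content.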