Easy way of not overwriting file when output is th

Published 2020-07-23 07:00

Question:

I have a C++ code generator in Python that generates many source files. Most of the time, only one file changes, but because the generator regenerates all of the files, they are all rebuilt. Is there a way to either get Python to not overwrite the files, or else to get cmake to use a checksum to see what needs to be rebuilt rather than just using the file date?

I was thinking something like this would be easy in Python: If I could replace

with open('blah', 'w') as f:

with this:

with open_but_only_overwrite_if_total_output_is_different('blah', 'w') as f:

What's a nice way of accomplishing that?

Answer 1:

Combining the code and ideas of Neil G, Petr Viktorin, gecco, and joel3000:

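import contextlib
import filecmp
import os
import shutil
import tempfile

@contextlib.contextmanager
def write_on_change(filename):
    # Write to a temporary file first; mode='w' lets callers write text.
    with tempfile.NamedTemporaryFile(mode='w', delete=False) as f:
        yield f
        tempname = f.name
    try:
        # shallow=False forces a comparison of the file contents.
        overwrite = not filecmp.cmp(tempname, filename, shallow=False)
    except (OSError, IOError):
        # The target does not exist yet (or cannot be read): write it.
        overwrite = True
    if overwrite:
        shutil.copyfile(tempname, filename)
    os.unlink(tempname)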

Some little additions (hopefully improvements):

  • shutil.copyfile copies only the contents of tempname into filename. It does not copy permission bits or other metadata, so an existing filename keeps its own permissions and ownership.
  • filecmp.cmp checks the size of the files and returns False if the sizes don't match. That could be a nice speedup if the files are large and one has stuff appended to the end. It also reads and compares BUFSIZE = 8*1024 bytes at a time, instead of a line at a time. BUFSIZE will generally be bigger than a line, which results in fewer reads.


Answer 2:

I'd suggest you write your own file-like object like this:

  • __enter__: Create a temporary file
  • __exit__: Compare the content of the temporary file with the old file (if it exists). If they are not the same, replace the old file with the temporary file.

This article is quite helpful for understanding the with statement: Understanding Python's "with" statement
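The two steps above could be sketched as a small class. This is only an illustration: the class name WriteOnChange is made up, the temporary file is opened in text mode on the assumption that the generator writes strings, and the choice to discard the temporary file when the body of the with block raises is an addition not mentioned in the answer.

```python
import filecmp
import os
import shutil
import tempfile


class WriteOnChange(object):
    """Hypothetical helper: write to a temp file, then replace the
    target only if the new content differs from the old content."""

    def __init__(self, filename):
        self.filename = filename
        self._tmp = None

    def __enter__(self):
        # Step 1: create a temporary file and hand it to the caller.
        self._tmp = tempfile.NamedTemporaryFile(mode="w", delete=False)
        return self._tmp

    def __exit__(self, exc_type, exc_value, traceback):
        self._tmp.close()
        tmpname = self._tmp.name
        try:
            if exc_type is not None:
                # Assumption: on failure, keep the old file untouched.
                return False
            # Step 2: compare contents; a missing target counts as changed.
            unchanged = (os.path.exists(self.filename)
                         and filecmp.cmp(tmpname, self.filename, shallow=False))
            if not unchanged:
                shutil.copyfile(tmpname, self.filename)
        finally:
            os.unlink(tmpname)
        return False  # never swallow exceptions from the with block
```

It would be used in place of the plain open call from the question, for example `with WriteOnChange('blah') as f: f.write(code)`.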



Answer 3:

Use filecmp - http://docs.python.org/library/filecmp.html .

Write your new files into a tmp directory, compare against your working directory, and transfer over the altered files. Then delete tmp.
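A minimal sketch of that workflow, assuming the generator writes a flat set of files (no subdirectories) into the temporary directory and that the working directory already exists. The function name sync_changed_files and its parameters are made up for illustration.

```python
import filecmp
import os
import shutil


def sync_changed_files(tmp_dir, work_dir):
    """Copy files from tmp_dir into work_dir only when contents differ.

    Hypothetical helper. Returns the sorted names of the files copied,
    then deletes tmp_dir.
    """
    names = sorted(n for n in os.listdir(tmp_dir)
                   if os.path.isfile(os.path.join(tmp_dir, n)))
    # cmpfiles returns three lists: matching files, mismatching files, and
    # files that could not be compared (for example, missing in work_dir).
    match, mismatch, errors = filecmp.cmpfiles(
        tmp_dir, work_dir, names, shallow=False)
    changed = sorted(mismatch + errors)
    for name in changed:
        shutil.copyfile(os.path.join(tmp_dir, name),
                        os.path.join(work_dir, name))
    shutil.rmtree(tmp_dir)
    return changed
```

Files whose contents match are never touched, so their modification times stay the same and the build tool does not rebuild them.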



Answer 4:

The easiest way is to do in Python exactly what cmake does: have the generator check if the input is newer than the output, and only generate if it is.

Here is a snippet I used for something similar:

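# Fragment from inside the generating function; requires "import os".
# Skip generation when the output exists and is at least as new as the source.
if (os.path.exists(output) and
    os.path.getmtime(source) <= os.path.getmtime(output)):
    print("Generated output %s is up-to-date." % output)
    return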
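The same check could be wrapped in a standalone function that accepts several inputs. This is a sketch; the name is_up_to_date and the sources list are made up for illustration. Passing the generator script itself as one of the sources is probably worthwhile, so that a change to the generator also triggers regeneration.

```python
import os


def is_up_to_date(sources, output):
    """Return True if output exists and is at least as new as every source.

    Hypothetical helper; sources is a list of input file paths.
    """
    if not os.path.exists(output):
        return False
    output_mtime = os.path.getmtime(output)
    return all(os.path.getmtime(src) <= output_mtime for src in sources)
```

Unlike the content comparison in the other answers, this only works when the generator knows exactly which inputs each output depends on.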


Tags: python cmake