I have up to 8 separate Python processes creating temp files in a shared folder. I'd then like the controlling process to append all the temp files, in a certain order, into one big file. What's the quickest way of doing this at an OS-agnostic shell level?
Rafe's answer lacked proper open/close statements, e.g.
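A minimal sketch of what the corrected version might look like, using `with` blocks so every file gets closed properly (the file names here are placeholders):

```python
tempfiles = ['temp1.txt', 'temp2.txt', 'temp3.txt']  # in the order you want them appended

with open('bigfile.txt', 'w') as bigfile:
    for tempfile_path in tempfiles:
        with open(tempfile_path, 'r') as tempfile:
            # each file is closed automatically when its with block exits
            bigfile.write(tempfile.read())
```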
However, be forewarned that if you want to sort the contents of the bigfile, this method does not catch instances where the last line in one or more of your temp files has a different EOL format, which will cause some strange sort results. In that case, you will want to strip the temp file lines as you read them, and then write consistent EOL lines to the bigfile (which costs an extra line of code).
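For example, a line-by-line variant (again a sketch, with the same placeholder names) that normalizes line endings as it copies:

```python
with open('bigfile.txt', 'w') as bigfile:
    for tempfile_path in tempfiles:
        with open(tempfile_path, 'r') as tempfile:
            for line in tempfile:
                # strip whatever EOL the temp file used, then write a consistent one
                bigfile.write(line.rstrip('\r\n') + '\n')
```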
Try this. It's very fast (much faster than line-by-line, and it shouldn't cause VM thrashing for large files), and it should run on just about anything, including CPython 2.x, CPython 3.x, PyPy, PyPy3, and Jython. It should also be highly OS-agnostic, and it makes no assumptions about file encodings.
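Something along these lines, copying fixed-size blocks through os.read/os.write so the data is treated as raw bytes (a sketch; the paths and blocksize below are placeholders):

```python
import os

tempfile_paths = ['temp1.bin', 'temp2.bin', 'temp3.bin']  # in the desired order
bigfile_path = 'bigfile.bin'
blocksize = 1 << 16  # 64 KiB; any reasonable block size works

# os.O_BINARY only exists on Windows; elsewhere it's a no-op
binary = getattr(os, 'O_BINARY', 0)

outfd = os.open(bigfile_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | binary)
try:
    for path in tempfile_paths:
        infd = os.open(path, os.O_RDONLY | binary)
        try:
            while True:
                block = os.read(infd, blocksize)
                if not block:          # empty read means end of file
                    break
                os.write(outfd, block)
        finally:
            os.close(infd)
finally:
    os.close(outfd)
```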
There is one (nontrivial) optimization I've left out: it's better not to assume anything about a good blocksize, and instead to try a bunch of random ones, slowly backing off the randomization to focus on the ones that perform well (sometimes called "simulated annealing"). But that's a lot more complexity for little actual performance benefit.
You could also track os.write's return value and restart partial writes, but that's only really necessary if you expect to receive (nonterminal) *ix signals.
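A sketch of such a retry loop (write_all is a hypothetical helper name, not something used above), which keeps writing until every byte has been accepted:

```python
import os

def write_all(fd, data):
    # os.write may write fewer bytes than requested; keep writing
    # the remainder until the whole buffer has been flushed to fd.
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]
```

You'd call write_all(outfd, block) in place of the bare os.write call in the loop above.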