I have a list of 20 file names, like ['file1.txt', 'file2.txt', ...]. I want to write a Python script to concatenate these files into a new file. I could open each file with f = open(...), read it line by line by calling f.readline(), and write each line into the new file. That doesn't seem very "elegant" to me, especially the part where I have to read/write line by line.
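For concreteness, the line-by-line version I have in mind would look something like this (file names are placeholders):

```python
filenames = ['file1.txt', 'file2.txt']  # ... up to file20.txt

with open('combined.txt', 'w') as outfile:
    for name in filenames:
        f = open(name)
        line = f.readline()
        while line:              # readline() returns '' at end of file
            outfile.write(line)
            line = f.readline()
        f.close()
```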
Is there a more "elegant" way to do this in Python?
If the files are not gigantic:
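Something along these lines should do (all names are placeholders):

```python
filenames = ['file1.txt', 'file2.txt']  # placeholder list

with open('combined.txt', 'w') as outfile:
    for name in filenames:
        with open(name) as infile:
            # Small enough to read each file into memory in one go.
            outfile.write(infile.read())
```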
If the files are too big to be entirely read and held in RAM, the algorithm must be a little different: read each file to be copied in a loop, in chunks of fixed length, using read(10000), for example.
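A sketch of that chunked copy (chunk size and names are placeholders):

```python
CHUNK_SIZE = 10000  # fixed-length chunks, as suggested above

filenames = ['file1.txt', 'file2.txt']  # placeholder list

with open('combined.txt', 'wb') as outfile:
    for name in filenames:
        with open(name, 'rb') as infile:
            while True:
                chunk = infile.read(CHUNK_SIZE)
                if not chunk:    # empty result means end of file
                    break
                outfile.write(chunk)
```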
That's exactly what fileinput is for:
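Something like this (the output name and input list are placeholders):

```python
import fileinput

filenames = ['file1.txt', 'file2.txt']  # placeholder list

# fileinput.input() yields lines from each file in turn,
# as if they were one continuous file.
with open('combined.txt', 'w') as fout, fileinput.input(filenames) as fin:
    for line in fin:
        fout.write(line)
```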
For this use case, it's really not much simpler than just iterating over the files manually, but in other cases, having a single iterator that iterates over all of the files as if they were a single file is very handy. (Also, the fact that fileinput closes each file as soon as it's done means there's no need to with or close each one, but that's just a one-line savings, not that big of a deal.)

There are some other nifty features in fileinput, like the ability to do in-place modifications of files just by filtering each line.
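For example, a trivial in-place filter might look like this (file names and the substitution are placeholders):

```python
import fileinput

# With inplace=True, stdout is redirected into the current file,
# so whatever is printed replaces the original line.
for line in fileinput.input(['file1.txt', 'file2.txt'], inplace=True):
    print(line.rstrip('\n').replace('old', 'new'))
```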
As noted in the comments, and discussed in another post, fileinput for Python 2.7 will not work as indicated. Here is a slight modification to make the code Python 2.7 compliant:
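The exact modified snippet isn't preserved here; one common 2.7-compatible form avoids using fileinput.input() as a context manager (that support only arrived in Python 3.2) and wraps it with contextlib.closing instead:

```python
import contextlib
import fileinput

filenames = ['file1.txt', 'file2.txt']  # placeholder list

with open('combined.txt', 'w') as fout:
    # fileinput.input() is not a context manager in 2.7,
    # so contextlib.closing() takes care of the close() call.
    with contextlib.closing(fileinput.input(filenames)) as fin:
        for line in fin:
            fout.write(line)
```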
I don't know about elegance, but this works:
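The original snippet isn't preserved here; a plausible sketch in the same spirit, shelling out to cat (assumes a Unix-like system; names are placeholders):

```python
import subprocess

filenames = ['file1.txt', 'file2.txt']  # placeholder list

# Let cat do the concatenation; not elegant, but it works on Unix-like systems.
with open('combined.txt', 'w') as outfile:
    subprocess.run(['cat'] + filenames, stdout=outfile, check=True)
```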
A simple benchmark shows that the shutil approach performs better.
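The benchmark itself isn't reproduced here; a minimal harness for comparing the two approaches (file names and repetition count are placeholders) could look like this:

```python
import shutil
import timeit

filenames = ['file1.txt', 'file2.txt']  # placeholder list

def concat_read_write(out='out_read.txt'):
    # Plain read()/write() of each whole file.
    with open(out, 'wb') as outfile:
        for name in filenames:
            with open(name, 'rb') as infile:
                outfile.write(infile.read())

def concat_shutil(out='out_shutil.txt'):
    # shutil.copyfileobj() copies in chunks internally.
    with open(out, 'wb') as outfile:
        for name in filenames:
            with open(name, 'rb') as infile:
                shutil.copyfileobj(infile, outfile)

print('read/write:', timeit.timeit(concat_read_write, number=10))
print('shutil    :', timeit.timeit(concat_shutil, number=10))
```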
This should do it.
For large files:
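A sketch of the line-by-line version (paths are placeholders):

```python
filenames = ['file1.txt', 'file2.txt']  # placeholder list

with open('combined.txt', 'w') as outfile:
    for fname in filenames:
        with open(fname) as infile:
            for line in infile:   # never holds more than one line in memory
                outfile.write(line)
```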
For small files:
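A sketch of the read-everything version (paths are placeholders):

```python
filenames = ['file1.txt', 'file2.txt']  # placeholder list

with open('combined.txt', 'w') as outfile:
    for fname in filenames:
        with open(fname) as infile:
            outfile.write(infile.read())  # whole file at once
```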
… and another interesting one that I thought of:
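Roughly along these lines (placeholder paths again):

```python
import itertools

filenames = ['file1.txt', 'file2.txt']  # placeholder list

with open('combined.txt', 'w') as outfile:
    # Chain the opened file objects into one long iterator of lines.
    for line in itertools.chain.from_iterable(map(open, filenames)):
        outfile.write(line)
```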
Sadly, this last method leaves a few open file descriptors, which the GC should take care of anyway. I just thought it was interesting.