I am trying to understand the trade-offs/differences between these two
ways of opening files for line-by-line processing:
with open('data.txt') as inf:
    for line in inf:
        # etc
vs
for line in open('data.txt'):
    # etc
I understand that using with ensures the file is closed when the
"with-block" (suite?) is exited (or an exception is encountered). So I have
been using with ever since I learned about it here.
Regarding the for loop: from searching around the net and SO, it seems that
whether the file is closed when the for loop is exited is implementation
dependent? And I couldn't find anything about how this construct would deal
with exceptions. Does anyone know?
If I am mistaken about anything above, I'd appreciate corrections.
Otherwise, is there a reason to ever use the for construct over the with?
(Assuming you have a choice, i.e., aren't limited by Python version.)
The problem with this
for line in open('data.txt'):
    # etc
is that you don't keep an explicit reference to the open file, so how do you close it?
The lazy way is to wait for the garbage collector to clean it up, but that may mean that the resources aren't freed in a timely manner.
So you can say
inf = open('data.txt')
for line in inf:
    # etc
inf.close()
Now what happens if there is an exception while you are inside the for loop? The file won't get closed explicitly.
Add a try/finally:
inf = open('data.txt')
try:
    for line in inf:
        # etc
finally:
    inf.close()
This is a lot of code to do something pretty simple, so Python added with to enable this code to be written in a more readable way. Which gets us to here:
with open('data.txt') as inf:
    for line in inf:
        # etc
So, that is the preferred way to open the file. If your Python is too old for the with statement, you should use the try/finally version for production code.
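To see the difference in practice, here is a small sketch that raises an exception part-way through the loop and then checks the file's closed attribute. The temporary file and the function names (make_sample, read_plain, read_with) are made up for the illustration and are not part of any library.

```python
import os
import tempfile


def make_sample():
    """Create a small throwaway text file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as out:
        out.write("one\ntwo\nthree\n")
    return path


def read_plain(path):
    """Open, loop, close: close() is skipped if the loop raises."""
    inf = open(path)
    try:
        for line in inf:
            raise ValueError("simulated failure")
        inf.close()  # never reached in this demo
    except ValueError:
        pass
    was_closed = inf.closed
    inf.close()  # tidy up so the demo itself does not leak the file
    return was_closed


def read_with(path):
    """Same loop inside a with block: the file is closed on the way out."""
    inf = None
    try:
        with open(path) as inf:
            for line in inf:
                raise ValueError("simulated failure")
    except ValueError:
        pass
    return inf.closed


sample = make_sample()
print(read_plain(sample))  # False
print(read_with(sample))   # True
os.remove(sample)
```

The name bound by as stays available after the block, which is why read_with can still inspect inf.closed.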
The with statement was only introduced in Python 2.5 (where it needs from __future__ import with_statement; from 2.6 on it is always available). Only if you have backward compatibility requirements for earlier versions should you use the latter.
Bit more clarity
The with statement was introduced (as you're aware) to encompass the try/except/finally pattern, which isn't terrific to understand, but okay. In CPython (the reference implementation, written in C), an open file is typically closed as soon as the last reference to it goes away, because CPython uses reference counting. The language specification itself doesn't promise that, so IronPython, Jython, etc. may keep files open and not free resources until the next GC cycle (or at all); the .NET and Java garbage collectors work differently from CPython's.
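Roughly speaking, with calls the object's __enter__ method on the way in and its __exit__ method on the way out, whether or not an exception occurred. The Tracked class below is a hypothetical stand-in for a file object, only there to make that order visible.

```python
class Tracked:
    """Hypothetical resource that records what happens to it."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Called on normal exit and when an exception escapes the block.
        self.events.append("exit")
        return False  # do not swallow the exception


res = Tracked()
try:
    with res:
        res.events.append("body")
        raise RuntimeError("simulated failure")
except RuntimeError:
    res.events.append("handled")

print(res.events)  # ['enter', 'body', 'exit', 'handled']
```

Because __exit__ runs before the exception reaches the except clause, the cleanup does not depend on when (or whether) a garbage collector gets around to the object.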
I think the only thing I've heard against the with statement is that it adds another indentation level.
So to summarise: won't work < 2.5, introduces the 'as' keyword and adds an indentation level.
Otherwise, you stay in control of handling exceptions as normal, and the finally block closes resources if something escapes.
Works for me!
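As a sketch of that last point: exception handling sits around the with block just as it would around any other code. The helper name count_lines and the file names are made up for the example, and the except clause assumes Python 3, where FileNotFoundError and PermissionError are subclasses of OSError.

```python
import os
import tempfile


def count_lines(path):
    """Return the number of lines in path, or None if it cannot be read."""
    try:
        with open(path) as inf:
            return sum(1 for _ in inf)
    except OSError as exc:
        # The with block has already closed the file (if it was opened at all).
        print("could not read", path, "-", exc)
        return None


with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "data.txt")
    with open(good, "w") as out:
        out.write("a\nb\nc\n")
    good_count = count_lines(good)
    missing_count = count_lines(os.path.join(tmp, "missing.txt"))

print(good_count)     # 3
print(missing_count)  # None
```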
import os
path = "c:\\fio"
longer_path = "c:\\fio\\"
# Read every file in directory
for filename in os.listdir(path):
    print()
    print("Here is the file name", filename)
    inf = open(longer_path + filename)
    try:
        for line in inf:
            print(line, end='')
    finally:
        inf.close()
#output
Here is the file name a.txt
mouse
apple
Here is the file name New Text Document - Copy.txt
cat
Here is the file name New Text Document.txt
dog
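For comparison, the same loop can be written with the with statement and os.path.join. This is only a sketch: the function name print_files is made up, it sorts the names so the order is predictable, and it assumes the directory holds readable text files like the ones above.

```python
import os


def print_files(path):
    """Print the name and contents of every regular file in path."""
    for filename in sorted(os.listdir(path)):
        full_path = os.path.join(path, filename)
        if not os.path.isfile(full_path):
            continue  # skip subdirectories
        print()
        print("Here is the file name", filename)
        with open(full_path) as inf:
            for line in inf:
                print(line, end='')
```

Calling print_files("c:\\fio") would then take the place of the loop above, and the separate longer_path variable is no longer needed.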