How do I check for EOF in Python? I found a bug in my code where the last block of text after the separator isn't added to the return list. Or maybe there's a better way of expressing this function?
Here's my code:
def get_text_blocks(filename):
    text_blocks = []
    text_block = StringIO.StringIO()
    with open(filename, 'r') as f:
        for line in f:
            text_block.write(line)
            print line
            if line.startswith('-- -'):
                text_blocks.append(text_block.getvalue())
                text_block.close()
                text_block = StringIO.StringIO()
    return text_blocks
You might find it easier to solve this using itertools.groupby.
def get_text_blocks(filename):
    import itertools
    with open(filename, 'r') as f:
        groups = itertools.groupby(f, lambda line: line.startswith('-- -'))
        return [''.join(lines) for is_separator, lines in groups if not is_separator]
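To see that groupby keeps the trailing block without any explicit EOF check, here is a small self-contained sketch that runs the same grouping over an in-memory list of lines instead of a file. The helper name split_blocks and the sample data are made up for illustration.

```python
import itertools

def split_blocks(lines):
    # Hypothetical helper: the same grouping, but over any iterable of lines.
    groups = itertools.groupby(lines, lambda line: line.startswith('-- -'))
    # Keep only the non-separator groups; separator lines are dropped.
    return [''.join(block) for is_separator, block in groups if not is_separator]

sample = ['first\n', 'block\n', '-- -\n', 'second\n', '-- -\n', 'last\n']
blocks = split_blocks(sample)
# blocks == ['first\nblock\n', 'second\n', 'last\n']
```

The last block comes out like any other because groupby simply yields its final group when the input runs out. One side effect worth knowing: consecutive separator lines fall into a single group, so they never produce an empty block.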
Another alternative is to use a regular expression to match the separators:
def get_text_blocks(filename):
    import re
    separator = re.compile(r'^-- -.*', re.M)
    with open(filename, 'r') as f:
        return re.split(separator, f.read())
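A quick check of what the split returns, using a made-up string in place of f.read(). Because . does not match a newline, the newline that ended each separator line stays at the start of the following block, so strip it if that matters to you.

```python
import re

separator = re.compile(r'^-- -.*', re.M)

# Made-up sample standing in for the contents of the file.
text = 'first\nblock\n-- -\nsecond\n-- - trailing\nlast\n'
blocks = re.split(separator, text)
# blocks == ['first\nblock\n', '\nsecond\n', '\nlast\n']
```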
The end-of-file condition holds as soon as the for statement terminates, so the simplest minimal fix is to append the last block right after the loop (you can call text_block.getvalue() there first if you want to check that it isn't empty before appending it).
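A minimal sketch of that fix, written against any iterable of lines so it can be tried without a file. It uses a plain list as the buffer instead of StringIO, which is my substitution, and the name collect_blocks is hypothetical.

```python
def collect_blocks(lines):
    blocks = []
    current = []  # lines of the block being built
    for line in lines:
        current.append(line)
        if line.startswith('-- -'):
            # Separator seen: emit the block (separator included, as in the original).
            blocks.append(''.join(current))
            current = []
    # The loop has ended, so the input is exhausted: flush whatever is left.
    if current:
        blocks.append(''.join(current))
    return blocks

sample = ['a\n', '-- -\n', 'b\n']
result = collect_blocks(sample)
# result == ['a\n-- -\n', 'b\n']
```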
This is the standard problem with emitting buffers.
You don't need to detect EOF. You just write out the last buffer once the loop is done.
import StringIO

def get_text_blocks(filename):
    text_blocks = []
    text_block = StringIO.StringIO()
    with open(filename, 'r') as f:
        for line in f:
            text_block.write(line)
            print line
            if line.startswith('-- -'):
                text_blocks.append(text_block.getvalue())
                text_block.close()
                text_block = StringIO.StringIO()
    ### At this moment, you are at EOF
    # StringIO objects don't support len(), so test the buffered string instead.
    last_block = text_block.getvalue()
    if last_block:
        text_blocks.append(last_block)
    ### Now your final block (if any) is appended.
    return text_blocks
Why do you need StringIO here?
def get_text_blocks(filename):
    text_blocks = [""]
    with open(filename, 'r') as f:
        for line in f:
            if line.startswith('-- -'):
                text_blocks.append(line)
            else:
                text_blocks[-1] += line
    return text_blocks
EDIT: Fixed the function. Other suggestions might be better; I just wanted to write a function similar to the original one.
EDIT: I had assumed the file starts with "-- -". Seeding the list with an empty string "fixes" the IndexError that would otherwise occur when it doesn't. Alternatively, you could use this version, which skips any lines before the first separator:
def get_text_blocks(filename):
    text_blocks = []
    with open(filename, 'r') as f:
        for line in f:
            if line.startswith('-- -'):
                text_blocks.append(line)
            else:
                if len(text_blocks) != 0:
                    text_blocks[-1] += line
    return text_blocks
But both versions look a bit ugly to me; the regex version is much cleaner.
This is a quick way to check whether a file is empty (more precisely, whether the current position is at EOF):
if f.read(1) == '':
    print "EOF"
    f.close()
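Note that read(1) consumes a character whenever the file is not at EOF. If you want to test without losing your place, one option is to remember the position and seek back afterwards. A sketch, assuming a seekable file object opened in text mode; the helper name at_eof is made up, and io.StringIO stands in for a real file.

```python
import io

def at_eof(f):
    # Hypothetical helper: peek one character, then restore the position.
    # Assumes a seekable text-mode file, where read() returns '' at EOF.
    pos = f.tell()
    ch = f.read(1)
    f.seek(pos)
    return ch == ''

f = io.StringIO(u'ab')
start_is_eof = at_eof(f)   # False: there is still data to read
f.read()                   # consume everything
end_is_eof = at_eof(f)     # True: read(1) now returns ''
empty_is_eof = at_eof(io.StringIO(u''))  # True: an empty file is at EOF immediately
```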