I have the following code, which reads from multiple files, parses the lines it gets, and prints the result:
import os
import re
files=[]
pars=[]
for i in os.listdir('path_to_dir_with_files'):
    files.append(i)
for f in files:
    with open('path_to_dir_with_files'+str(f), 'r') as a:
        pars.append(re.sub('someword=|\,.*|\#.*','',a.read()))
for k in pars:
    print k
But I have a problem with multiple newlines in the output:
test1

test2
Instead, I want to get the following result, with no empty lines in the output:
test1
test2
and so on.
I tried playing with regexp:
pars.append(re.sub('someword=|\,.*|\#.*|^\n$','',a.read()))
But it doesn't work. I also tried strip(), rstrip(), and replace(), and none of them worked either.
Could you please help?
You could use a second regex to replace multiple newlines with a single newline, and use strip() to get rid of the trailing newline.
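A minimal sketch of that idea, assuming the already-parsed text of one file is held in a string. The helper name collapse_newlines and the sample text are made up for illustration, and print() is used so the snippet also runs on Python 3:

```python
import re

def collapse_newlines(text):
    # Hypothetical helper: replace every run of newlines with a single
    # newline, then strip leading/trailing whitespace (including the
    # final newline).
    return re.sub(r'\n+', '\n', text).strip()

# Sample input invented for illustration
sample = "test1\n\n\ntest2\n\n"
print(collapse_newlines(sample))
```

In the question's loop this would wrap the existing re.sub call. Lines that contain only spaces or tabs are not treated as empty by this pattern.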
Without changing your code much, one easy way is to check whether the line is empty before you print it.
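That check could look roughly like this. It is only a sketch and assumes each element holds a single line; non_empty and the sample list are hypothetical:

```python
def non_empty(items):
    # Hypothetical helper: keep only the items that contain something
    # other than whitespace.
    return [item for item in items if item.strip()]

# Sample data invented for illustration; each element is a single line
lines = ["test1", "", "test2", "\n"]
for line in non_empty(lines):
    print(line)
```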
*** EDIT Since each element in pars is actually the entire content of a file (not just a line), you need to go through and replace any consecutive newlines, which is easiest to do with re.
Note that replacing within each element doesn't take care of the case where one file ends with a newline and the next one begins with one. If that's a case you are worried about, you need to either add extra logic to deal with it or change the way you're reading the data in.
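One way to cover both the regex replacement and the file-boundary case is to join the elements first and collapse newlines across the combined text. This is only a sketch: it assumes pars is a list of strings as in the question, and join_and_collapse and the sample data are hypothetical.

```python
import re

def join_and_collapse(pars):
    # Hypothetical helper: join the per-file strings with a newline so that
    # blank lines spanning a file boundary are caught too, then replace
    # every run of two or more newlines with a single one.
    combined = '\n'.join(pars)
    return re.sub(r'\n{2,}', '\n', combined).strip('\n')

# Sample data invented for illustration: the first "file" ends with a
# newline and the second begins with one.
pars = ["test1\n\ntest2\n", "\ntest3\n\n\ntest4\n"]
print(join_and_collapse(pars))
```

Printing the joined result once replaces the final loop over pars.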
I'd just like to point out that regexes aren't the best way to handle this. Replacing two consecutive newlines with one in a Python str is quite simple with the built-in string methods, no need for re.
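A sketch of the string-only approach using str.replace. A single replace("\n\n", "\n") pass only roughly halves a longer run of newlines, so the loop below repeats until no pair is left. The helper name squeeze_newlines and the sample text are made up for illustration:

```python
def squeeze_newlines(text):
    # Hypothetical helper: keep replacing pairs of newlines with a single
    # newline until no consecutive newlines remain.
    while "\n\n" in text:
        text = text.replace("\n\n", "\n")
    return text

# Sample input invented for illustration
sample = "test1\n\n\n\ntest2\n\ntest3"
print(squeeze_newlines(sample))
```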
And voila! Much faster than re and (in my opinion) much easier to read.