Processing CSV files with csv.DictReader is great - but I have CSV files with comment lines in (indicated by a hash at the start of a line), for example:
# step size=1.61853
val0,val1,val2,hybridisation,temp,smattr
0.206895,0.797923,0.202077,0.631199,0.368801,0.311052,0.688948,0.597237,0.402763
-169.32,1,1.61853,2.04069e-92,1,0.000906546,0.999093,0.241356,0.758644,0.202382
# adaptation finished
The csv module doesn't include any way to skip such lines.
I could easily do something hacky, but I imagine there's a nice way to wrap a csv.DictReader around some other iterator object that preprocesses the input to discard those lines.
Actually this works nicely with filter:
import csv
fp = open('samples.csv')
rdr = csv.DictReader(filter(lambda row: row[0]!='#', fp))
for row in rdr:
    print(row)
fp.close()
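If the comment lines might be indented, or the file contains blank lines, the same idea can be written as a small generator instead of a lambda. This is only a sketch: the name skip_comment_lines, its marker parameter, and the in-memory sample (standing in for samples.csv) are made up for illustration.

```python
import csv
import io

def skip_comment_lines(lines, marker="#"):
    """Yield only lines that are not blank and do not start with the marker."""
    for line in lines:
        stripped = line.lstrip()
        if stripped and not stripped.startswith(marker):
            yield line

# Hypothetical in-memory sample standing in for samples.csv
sample = io.StringIO(
    "# step size=1.61853\n"
    "val0,val1,val2\n"
    "0.206895,0.797923,0.202077\n"
    "  # indented comment\n"
    "\n"
    "-169.32,1,1.61853\n"
    "# adaptation finished\n"
)

rows = list(csv.DictReader(skip_comment_lines(sample)))
for row in rows:
    print(row)
```

Like the filter version, this looks at physical lines before the csv module parses them, so a quoted field that spans several lines and has a continuation line starting with # would lose that line.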
Good question, and a good example of how Python's csv module lacks important functionality, such as handling basic comments (not uncommon at the top of CSV files). While Dan Stowell's solution works for the OP's specific case, it is limited in that # must appear as the first character of the line. A more generic solution would be:
import csv

def decomment(csvfile):
    for row in csvfile:
        # Drop everything from the first '#' onwards, then trim whitespace
        raw = row.split('#')[0].strip()
        if raw:
            yield raw

with open('dummy.csv') as csvfile:
    reader = csv.reader(decomment(csvfile))
    for row in reader:
        print(row)
As an example, the following dummy.csv file:
# comment
# comment
a,b,c # comment
1,2,3
10,20,30
# comment
returns
['a', 'b', 'c']
['1', '2', '3']
['10', '20', '30']
Of course, this works just as well with csv.DictReader().
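For completeness, here is a rough sketch of the DictReader variant. It repeats decomment so it runs on its own, and uses an io.StringIO object as a stand-in for dummy.csv (that stand-in and the variable names are just for illustration). The last two lines also show one thing to watch out for: because the split happens before any CSV parsing, a # inside a quoted field is treated as a comment too.

```python
import csv
import io

def decomment(csvfile):
    for row in csvfile:
        # Drop everything from the first '#' onwards, then trim whitespace
        raw = row.split('#')[0].strip()
        if raw:
            yield raw

# In-memory stand-in for dummy.csv
dummy = io.StringIO(
    "# comment\n"
    " # comment\n"
    "a,b,c # comment\n"
    "1,2,3\n"
    "10,20,30\n"
    "# comment\n"
)

# The first surviving line (a,b,c) becomes the field names
records = list(csv.DictReader(decomment(dummy)))
for record in records:
    print(record)

# Caveat: a '#' inside a quoted field is cut off as well
truncated = list(decomment(['x,"a#b",y\n']))
print(truncated)
```

So this approach is fine as long as the data itself never contains a #; if it can, the comment stripping would have to be aware of quoting.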