Python csv reader: ignore blank rows

Posted 2019-01-27 04:58

Question:

I'm using Python's csv reader. How can I change the following code so that it ignores blank lines?

import csv
f1 = open ("ted.csv")
oldFile1 = csv.reader(f1, delimiter=',', quotechar='"')
oldList1 = list(oldFile1)
f2 = open ("ted2.csv")
newFile2 = csv.reader(f2, delimiter=',', quotechar='"')
newList2 = list(newFile2)

f1.close()
f2.close()

with open("ted.csv") as f1, open("ted2.csv") as f2, open('foo.csv', 'w') as out:
    r1, r2 = csv.reader(f1), csv.reader(f2)
    st = set((row[0], row[3]) for row in r1)
    wr = csv.writer(out)
    for row in (row for row in r2 if (row[0], row[3]) not in st):
        wr.writerow(row)

Answer 1:

If your CSV files start with a single blank line, you should be able to skip it by calling readline() on each file before creating the csv reader:

with open("ted.csv") as f1, open("ted2.csv") as f2, open('foo.csv', 'w') as out:
    f1.readline()
    f2.readline()
    r1, r2 = csv.reader(f1), csv.reader(f2)
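
If the number of leading blank lines is not known in advance, one possible variation is to drop lines until the first non-blank one appears. The following is only a sketch: the helper name skip_leading_blank_lines is hypothetical, and io.StringIO stands in for an open file so the example runs on its own.

```python
import csv
import io
from itertools import dropwhile


def skip_leading_blank_lines(lines):
    """Return an iterator that starts at the first line containing text.

    Hypothetical helper: lines that are empty or whitespace-only are
    dropped only while they appear at the very beginning of the input.
    """
    return dropwhile(lambda line: not line.strip(), lines)


# io.StringIO stands in for an open file (assumption for this sketch).
sample = io.StringIO("\n\n\ntime,name,location\n12345,Jean,Montreal\n")

rows = list(csv.reader(skip_leading_blank_lines(sample)))
print(rows)
```

Blank lines that appear later in the file are left alone by this approach; it only trims the start.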


Answer 2:

If the blank line is always the first line, then Marius' answer is the simplest. If there are n blank lines at the beginning, or you just want to skip a fixed number of lines, you can use itertools.islice().

Skip first N lines

Suppose you want to skip over the first 4 lines (blank lines or not):

import csv
from itertools import islice

with open('csv2.csv', 'r') as f1, open('out.csv', 'w') as out:
    # skip the first 4 lines, then pass the rest to csv.reader
    filt_f1 = islice(f1, 4, None)
    r1 = csv.reader(filt_f1)
    wr = csv.writer(out)
    for line in r1:
        ...
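
To see what islice() does and does not remove, here is a small sketch on in-memory data. The sample text is an assumption modelled on the csv2.csv described further down (4 blank lines at the top, blank lines between records), with io.StringIO standing in for the file.

```python
import csv
import io
from itertools import islice

# In-memory stand-in for a file with 4 leading blank lines (assumed sample data).
data = io.StringIO("\n\n\n\n123457,Scott,San Diego\n\n123458,Jen,Miami\n")

# Skip the first 4 lines, whatever they contain.
reader = csv.reader(islice(data, 4, None))
rows = list(reader)

# The blank line between the two records is still there:
# csv.reader returns it as an empty list.
print(rows)
```

So islice() handles only a fixed-size prefix; blank lines later in the file still come through as empty rows.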

Blank lines throughout

If you have blank lines scattered throughout your files, you can filter them out with itertools.filterfalse before the lines reach csv.reader:

import csv
from itertools import filterfalse
from itertools import chain

with open('csv1.csv', 'r') as f1, open('csv2.csv', 'r') as f2, open('out.csv', 'w', newline='') as out:
    # create an iterator without lines that start with '\n'
    filt_f1 = filterfalse(lambda line: line.startswith('\n'), f1)
    filt_f2 = filterfalse(lambda line: line.startswith('\n'), f2)

    # csv.reader consumes the filtered iterators
    r1, r2 = csv.reader(filt_f1), csv.reader(filt_f2)
    wr = csv.writer(out)

    # here insert your logic, I just write both to the same file
    for line in chain(r1, r2):
        wr.writerow(line)
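
Note that line.startswith('\n') matches only completely empty lines, and that filtering raw lines could, in principle, also remove an empty line inside a quoted multi-line field. An alternative sketch is to filter after parsing: csv.reader returns an empty list for a blank line, so empty rows can be skipped directly. The generator name non_blank_rows is hypothetical, and io.StringIO stands in for a real file.

```python
import csv
import io


def non_blank_rows(reader):
    """Yield only rows that contain at least one non-whitespace field.

    Hypothetical helper: skips the empty lists csv.reader produces for
    blank lines, as well as rows made up of whitespace or empty fields.
    """
    for row in reader:
        if any(field.strip() for field in row):
            yield row


# Sample with an empty line and a whitespace-only line (assumed data).
sample = io.StringIO(
    "time,name,location\n"
    "12345,Jean,Montreal\n"
    "\n"
    "   \n"
    "12346,Peter,Chicago\n"
)

rows = list(non_blank_rows(csv.reader(sample)))
print(rows)
```

Be aware that this version also drops rows such as ",,", where every field is empty; if those rows are meaningful, testing just `if row` keeps them.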

Where csv1.csv is:

time,name,location
12345,Jean,Montreal

12346,Peter,Chicago

1234589,Doug,Boston

and csv2.csv is (not shown here: the file also has 4 blank lines at the top):

123457,Scott,San Diego

123458,Jen,Miami

123459,Robert,Sacramento

The output, out.csv, has no blank lines:

time,name,location
12345,Jean,Montreal
12346,Peter,Chicago
1234589,Doug,Boston
123457,Scott,San Diego
123458,Jen,Miami
123459,Robert,Sacramento
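
Tying this back to the code in the question: there, a blank row would make row[0] and row[3] raise IndexError, so the blank rows have to be skipped before the key is built. The sketch below is one way that might look. The function name rows_only_in_second and its key_columns parameter are hypothetical, and io.StringIO replaces ted.csv and ted2.csv so the example is self-contained.

```python
import csv
import io


def rows_only_in_second(old_lines, new_lines, key_columns=(0, 3)):
    """Return rows from new_lines whose key does not appear in old_lines.

    Hypothetical helper mirroring the question's logic; blank rows
    (empty lists from csv.reader) are skipped in both inputs.
    """
    old_rows = (row for row in csv.reader(old_lines) if row)
    seen = {tuple(row[i] for i in key_columns) for row in old_rows}
    return [
        row
        for row in csv.reader(new_lines)
        if row and tuple(row[i] for i in key_columns) not in seen
    ]


# Assumed sample data with blank lines in both inputs.
old = io.StringIO("a,1,x,p\n\nb,2,y,q\n")
new = io.StringIO("\na,9,z,p\nc,3,w,r\n\n")

result = rows_only_in_second(old, new)
print(result)
```

With real files, the two StringIO objects would be replaced by the open file handles, and the result written out with csv.writer as in the question.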