Using csvreader against a gzipped file in Python

Posted 2019-03-12 02:27

I have a bunch of gzipped CSV files that I'd like to open for inspection using Python's built-in CSV reader. I'd like to do this without first having to unzip them to disk manually. I guess I want to get a stream of the uncompressed data and pass it to the CSV reader. Is this possible in Python?

Tags: python csv gzip
3 answers
欢心
#2 · 2019-03-12 02:37

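Use the gzip module, opening the file in text mode so that csv.reader receives strings (needed in Python 3):

import csv, gzip

with gzip.open(filename, "rt", newline="") as f:
    reader = csv.reader(f)
    # ...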
兄弟一词,经得起流年.
#3 · 2019-03-12 02:40

A more complete solution:

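import csv, gzip

class GZipCSVReader:
    def __init__(self, filename):
        # Text mode ("rt") so that csv.DictReader gets str, not bytes, in Python 3
        self.gzfile = gzip.open(filename, "rt", newline="")
        self.reader = csv.DictReader(self.gzfile)
    def __next__(self):
        # Python 3 iterator protocol (was next() in Python 2)
        return next(self.reader)
    def close(self):
        self.gzfile.close()
    def __iter__(self):
        return iter(self.reader)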

Now you can use it like this:

r = GZipCSVReader('my.csv.gz')
for row in r:
    # Each row is a dict; .items() yields (column name, value) pairs
    for k, v in row.items():
        print(k, v)
r.close()
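
If you would rather not call close() by hand, one option is to wrap the same idea in a context manager. This is only a sketch under assumptions: the helper name open_gzipped_csv and the sample file are made up for illustration and are not part of the answer above.

```python
import contextlib
import csv
import gzip
import os
import tempfile


@contextlib.contextmanager
def open_gzipped_csv(filename):
    # Hypothetical helper: yields a DictReader and closes the file on exit.
    with gzip.open(filename, "rt", newline="") as f:
        yield csv.DictReader(f)


# Build a small gzipped CSV so the sketch can be run as-is.
path = os.path.join(tempfile.mkdtemp(), "sample.csv.gz")
with gzip.open(path, "wt", newline="") as f:
    csv.writer(f).writerows([["name", "qty"], ["apple", "3"], ["pear", "5"]])

with open_gzipped_csv(path) as reader:
    rows = [dict(row) for row in reader]

print(rows)
```

The file is closed when the with block ends, even if reading raises an exception partway through.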

EDIT: following the comment below, how about a simpler approach:

def gzipped_csv(filename):
    # "rt" opens the decompressed stream as text, which csv needs in Python 3
    with gzip.open(filename, "rt", newline="") as f:
        r = csv.DictReader(f)
        for row in r:
            yield row

which lets you then write:

for row in gzipped_csv(filename):
    for k, v in row.items():
        print(k, v)
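
Since the question is about inspecting files, it may help that a generator like this is lazy: combined with itertools.islice, it lets you peek at the first few rows without parsing the rest. A minimal sketch, assuming Python 3 and a made-up sample file (the file name and its contents are illustrative only):

```python
import csv
import gzip
import itertools
import os
import tempfile


def gzipped_csv(filename):
    # The generator from this answer, opened in text mode for Python 3.
    with gzip.open(filename, "rt", newline="") as f:
        for row in csv.DictReader(f):
            yield row


# Made-up sample data so the sketch is runnable.
path = os.path.join(tempfile.mkdtemp(), "big.csv.gz")
with gzip.open(path, "wt", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["id", "value"])
    writer.writerows([i, i * i] for i in range(1000))

# islice stops after three rows, so the remaining rows are never parsed.
first_rows = [dict(row) for row in itertools.islice(gzipped_csv(path), 3)]
print(first_rows)
```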
爷的心禁止访问
#4 · 2019-03-12 02:41

I tried the versions above for writing and reading, and they didn't work in Python 3.3 because of a "bytes" error. After some trial and error, though, I got the following to work. Maybe it helps others too:

import csv
import gzip
import io


with gzip.open("test.gz", "w") as file:
    writer = csv.writer(io.TextIOWrapper(file, newline="", write_through=True))
    writer.writerow([1, 2, 3])
    writer.writerow([4, 5, 6])

with gzip.open("test.gz", "r") as file:
    reader = csv.reader(io.TextIOWrapper(file, newline=""))
    print(list(reader))

As amohr suggests, the following works as well:

import gzip, csv

with gzip.open("test.gz", "wt", newline="") as file:
    writer = csv.writer(file)
    writer.writerow([1, 2, 3])
    writer.writerow([4, 5, 6])

with gzip.open("test.gz", "rt", newline="") as file:
    reader = csv.reader(file)
    print(list(reader))
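
For anyone wondering where the "bytes" error comes from: gzip.open defaults to binary mode, while csv.reader in Python 3 expects lines of str. The sketch below reproduces the failure and the text-mode fix side by side; the temporary file path is an assumption made so it can run anywhere.

```python
import csv
import gzip
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test.gz")
with gzip.open(path, "wt", newline="") as f:
    csv.writer(f).writerow([1, 2, 3])

# Default mode is binary ("rb"), so csv.reader is handed bytes lines.
error_message = None
with gzip.open(path) as f:
    try:
        next(csv.reader(f))
    except csv.Error as exc:
        error_message = str(exc)

# Text mode ("rt") decodes to str, which csv.reader accepts.
with gzip.open(path, "rt", newline="") as f:
    rows = list(csv.reader(f))

print(error_message)
print(rows)
```

Note that the values come back as strings, since csv does not convert types on reading.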