Is there a way to use `json.dump` with `gzip`?

Posted 2020-04-06 07:09

Question:

Here is a great answer about how to use json.dumps to write to a gzip file. What I would like to do is to use the dump method instead to serialize the json directly into a GzipFile object.

Example code:

import gzip, json

data = {}  # a dictionary of data here
with gzip.open(write_file, 'w') as zipfile:
   json.dump(data, zipfile)

The error raised is

TypeError: memoryview: a bytes-like object is required, not 'str'

I believe this happens because the gzip write() method expects a bytes object to be passed to it. Per the documentation:

The json module always produces str objects, not bytes objects. Therefore, fp.write() must support str input.

Is there a way to wrap the json string output as bytes so that GzipFile's write() will handle it? Or is the only way to do this to use json.dumps and encode() the resulting string into a bytes object, as in the other linked answer?

Answer 1:

The gzip module supports it out of the box: just declare an encoding and it will encode the unicode string to bytes before writing it to the file:

with gzip.open(write_file, 'wt', encoding="ascii") as zipfile:
   json.dump(data, zipfile)

Make sure you specify text mode ('wt').

Because json.dump escapes every non-ASCII character by default (ensure_ascii=True), the ASCII encoding is enough, but you could use any other encoding that agrees with ASCII on the first 128 code points, such as Latin-1 or UTF-8.
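
A minimal round-trip sketch of the text-mode approach described above. The sample data and the temporary file path are assumptions made only for illustration; the rest follows the answer's code.

```python
import gzip
import json
import os
import tempfile

# Hypothetical sample data and file location, used only for illustration.
data = {"name": "café", "values": [1, 2, 3]}
write_file = os.path.join(tempfile.mkdtemp(), "data.json.gz")

# Text mode ('wt') lets the gzip file object accept str and encode it itself.
with gzip.open(write_file, "wt", encoding="ascii") as zipfile:
    json.dump(data, zipfile)

# Read it back in text mode ('rt') with the same encoding.
with gzip.open(write_file, "rt", encoding="ascii") as zipfile:
    restored = json.load(zipfile)
```

The non-ASCII character in the sample survives the ASCII encoding because json.dump writes it as a \u escape by default. If ensure_ascii=False were passed, an encoding such as UTF-8 would be needed instead.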



Answer 2:

To convert a string to bytes, serialize the data with json.dumps first, encode the result, and write it to a file opened in binary mode:

zipfile.write(bytes(json.dumps(data), "utf-8"))
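
For comparison, here is a minimal sketch of the binary-mode route: serialize to a str with json.dumps, encode that string to bytes, and write the bytes to a gzip file opened with 'wb'. The sample data and temporary file path are assumptions for illustration only.

```python
import gzip
import json
import os
import tempfile

# Hypothetical sample data and file location, used only for illustration.
data = {"id": 7, "tags": ["a", "b"]}
write_file = os.path.join(tempfile.mkdtemp(), "data.json.gz")

# Binary mode: the gzip file object only accepts bytes,
# so encode the JSON string before writing.
with gzip.open(write_file, "wb") as zipfile:
    zipfile.write(json.dumps(data).encode("utf-8"))

# Reading back: decompress to bytes, decode, then parse.
with gzip.open(write_file, "rb") as zipfile:
    restored = json.loads(zipfile.read().decode("utf-8"))
```

Note that bytes() cannot be applied to the dictionary itself, and json.dump cannot serialize a bytes object, so the encoding step has to come after serialization.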