I would like to know which one of json.dump() or json.dumps() is the most efficient when it comes to encoding a large array to JSON format.

Can you please show me an example of using json.dump()?
Actually I am making a Python CGI that gets a large amount of data from a MySQL database using the ORM SQLAlchemy, and after some user-triggered processing, I store the final output in an array that I finally convert to JSON.

But when converting to JSON with:

    print json.dumps({'success': True, 'data': data})  # data is my array
I get the following error:
    Traceback (most recent call last):
      File "C:/script/cgi/translate_parameters.py", line 617, in <module>
        f.write(json.dumps(mytab,default=dthandler,indent=4))
      File "C:\Python27\lib\json\__init__.py", line 250, in dumps
        sort_keys=sort_keys, **kw).encode(obj)
      File "C:\Python27\lib\json\encoder.py", line 209, in encode
        chunks = list(chunks)
    MemoryError
So my guess is to use json.dump() to convert the data by chunks. Any ideas on how to do this?

Or any other ideas besides using json.dump()?
The json module will allocate the entire JSON string in memory before writing, which is why MemoryError occurs.

To get around this problem, use json.JSONEncoder().iterencode().
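A minimal sketch of that approach, assuming data is the object to encode and output.json is a placeholder file name:

    import json

    # Stand-in for the real object built from the database rows.
    data = {'success': True, 'data': [{'id': i} for i in range(1000000)]}

    # iterencode() yields the JSON output piece by piece, so the complete
    # string never has to exist in memory at once.
    with open('output.json', 'w') as f:
        for chunk in json.JSONEncoder().iterencode(data):
            f.write(chunk)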
However, note that this will generally take quite a while, since it is writing many small chunks and not everything at once.
Special case:
I had a Python object which is a list of dicts, like so:
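For illustration only (the keys are invented; mytab reuses the name from the question's traceback):

    mytab = [
        {"id": 1, "name": "first row", "value": 3.14},
        {"id": 2, "name": "second row", "value": 2.72},
        # ... and a very large number of similar dicts
    ]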
I could json.dumps() the individual objects, but dumping the whole list generates a MemoryError.
To speed up writing, I opened the file and wrote the JSON delimiters manually, as sketched below.
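Something like this, assuming the mytab list above and a placeholder output file name:

    import json

    with open('output.json', 'w') as f:
        f.write('[')                  # opening list delimiter
        for i, row in enumerate(mytab):
            if i:
                f.write(',\n')        # separator between dicts
            f.write(json.dumps(row))  # each dict is small enough to dump on its own
        f.write(']')                  # closing list delimiter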
You can probably get away with something like that if you know your JSON object structure beforehand. For general use, just use json.JSONEncoder().iterencode().
You can simply replace the json.dumps() call shown in your traceback with a direct json.dump() into the open file object, as in the sketch below. This should "stream" the data into the file.
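Presumably something along these lines, reusing the f, mytab, and dthandler names from the traceback (the placeholders here are only to make the snippet runnable):

    import json

    dthandler = str                   # placeholder for your date/time handler
    mytab = [{'id': 1}, {'id': 2}]    # placeholder for the real list of dicts

    with open('output.json', 'w') as f:
        # Instead of building the whole string in memory first:
        #     f.write(json.dumps(mytab, default=dthandler, indent=4))
        # let json.dump() encode incrementally and write the chunks to f as it goes:
        json.dump(mytab, f, default=dthandler, indent=4)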