MemoryError using json.dumps()

Posted 2019-01-25 20:27

I would like to know which of json.dump() or json.dumps() is more efficient when encoding a large array to JSON.

Can you please show me an example of using json.dump()?

Actually I am making a Python CGI script that fetches a large amount of data from a MySQL database using the ORM SQLAlchemy. After some user-triggered processing, I store the final output in an array that I finally convert to JSON.

But when converting to JSON with:

 print json.dumps({'success': True, 'data': data}) #data is my array

I get the following error:

Traceback (most recent call last):
  File "C:/script/cgi/translate_parameters.py", line 617, in <module>
    f.write(json.dumps(mytab, default=dthandler, indent=4))
  File "C:\Python27\lib\json\__init__.py", line 250, in dumps
    sort_keys=sort_keys, **kw).encode(obj)
  File "C:\Python27\lib\json\encoder.py", line 209, in encode
    chunks = list(chunks)
MemoryError

So my guess is that json.dump() could convert the data in chunks. Any ideas on how to do this?

Or other ideas besides using json.dump()?

2 Answers
萌系小妹纸
2019-01-25 20:42

The JSON module will allocate the entire JSON string in memory before writing, which is why MemoryError occurs.

To get around this problem, use json.JSONEncoder().iterencode():

with open(filepath, 'w') as f:
    for chunk in json.JSONEncoder().iterencode(object_to_encode):
        f.write(chunk)

However, note that this will generally take quite a while, since it writes many small chunks rather than everything at once.
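One way to reduce that overhead is to buffer the small chunks from iterencode() into larger writes. A minimal sketch (the function name `write_json_iter` and the buffer size are choices made here, not part of the original answer):

```python
import json

def write_json_iter(obj, f, buffer_size=64 * 1024):
    """Stream-encode obj into file f, buffering small iterencode()
    chunks into larger writes to cut down on write-call overhead."""
    buf = []
    buffered = 0
    for chunk in json.JSONEncoder().iterencode(obj):
        buf.append(chunk)
        buffered += len(chunk)
        if buffered >= buffer_size:
            f.write(''.join(buf))
            buf = []
            buffered = 0
    f.write(''.join(buf))  # flush whatever remains
```

At no point does this hold the full JSON string in memory; only up to `buffer_size` characters of encoded output are kept at once.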


Special case:

I had a Python object which is a list of dicts. Like such:

[
    { "prop": 1, "attr": 2 },
    { "prop": 3, "attr": 4 }
    # ...
]

I could json.dumps() individual objects, but dumping the whole list raised a MemoryError. To speed up writing, I opened the file and wrote the JSON delimiters manually:

with open(filepath, 'w') as f:
    f.write('[')

    # note: assumes list_of_dicts is non-empty
    for obj in list_of_dicts[:-1]:
        json.dump(obj, f)
        f.write(',')

    json.dump(list_of_dicts[-1], f)
    f.write(']')

You can probably get away with something like that if you know your JSON object structure beforehand. For general use, just use json.JSONEncoder().iterencode().
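The slicing approach above breaks on an empty list and requires indexable input. A small variant sketch that handles both and accepts any iterable (the name `dump_list` is chosen here for illustration):

```python
import json

def dump_list(items, f, default=None):
    """Write a JSON array element by element, emitting the
    bracket and comma delimiters by hand.  Works for an empty
    list and for arbitrary iterables, e.g. a SQLAlchemy query."""
    f.write('[')
    first = True
    for obj in items:
        if not first:
            f.write(',')
        json.dump(obj, f, default=default)
        first = False
    f.write(']')
```

Because it consumes `items` one element at a time, only a single element is ever JSON-encoded in memory.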

唯我独甜
2019-01-25 20:51

You can simply replace

f.write(json.dumps(mytab,default=dthandler,indent=4))

by

json.dump(mytab, f, default=dthandler, indent=4)

This streams the data into the file instead of building the whole string in memory first.
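Put together, this might look as follows. The question never shows how `dthandler` is defined; a common definition, assumed here, serializes datetime objects to ISO 8601 strings:

```python
import datetime
import json
import tempfile

def dthandler(obj):
    # assumption: dthandler converts dates/datetimes to ISO strings
    if isinstance(obj, (datetime.date, datetime.datetime)):
        return obj.isoformat()
    raise TypeError('Object of type %s is not JSON serializable'
                    % type(obj).__name__)

# stand-in for the data fetched via SQLAlchemy
mytab = [{'id': 1, 'created': datetime.datetime(2019, 1, 25, 20, 27)}]

with tempfile.NamedTemporaryFile('w+', suffix='.json',
                                 delete=False) as f:
    # json.dump() writes iterencode() chunks directly to f,
    # so the full JSON string is never built in memory
    json.dump(mytab, f, default=dthandler, indent=4)
    path = f.name
```

Whether this avoids the MemoryError depends on the size of the individual chunks, but it removes the single biggest allocation: the complete output string.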
