I have some JSON files that are 500 MB each. If I use the "trivial" json.load to load the whole content at once, it will consume a lot of memory.
Is there a way to read the file partially? If it were a text, line-delimited file, I would be able to iterate over the lines. I am looking for an analogy to that.
Any suggestions? Thanks
There was a duplicate of this question that had a better answer. See https://stackoverflow.com/a/10382359/1623645, which suggests ijson.
Update:
I tried it out, and ijson is to JSON what SAX is to XML. For instance, you can do something like this (a minimal sketch; the filename is a placeholder):
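```python
import ijson

# Iterate over SAX-like parse events instead of loading the whole file into memory.
with open('myfile.json', 'rb') as f:  # placeholder filename
    for prefix, the_type, value in ijson.parse(f):
        print(prefix, the_type, value)
```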
where prefix is a dot-separated index into the JSON tree (what happens if your key names have dots in them? I guess that would be bad for JavaScript, too...), the_type describes a SAX-like event, one of 'null', 'boolean', 'number', 'string', 'map_key', 'start_map', 'end_map', 'start_array', 'end_array', and value is the value of the object, or None if the_type is an event like starting/ending a map/array.

The project has some docstrings, but not enough global documentation. I had to dig into ijson/common.py to find what I was looking for.

Another idea is to load it into a document-store database like MongoDB. It deals with large blobs of JSON well, although you might run into the same problem loading the JSON in the first place; avoid that by loading the files one at a time.
If that path works for you, then you can interact with the JSON data via their client and potentially not have to hold the entire blob in memory.
http://www.mongodb.org/
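As a rough sketch (assuming pymongo, that the top level of each file is a JSON array of objects, and with placeholder database, collection, and file names), you could combine ijson with MongoDB's Python client to insert records one at a time, so a whole file never has to sit in memory:

```python
import ijson
from pymongo import MongoClient

# Placeholder connection string, database, and collection names.
client = MongoClient('mongodb://localhost:27017/')
collection = client['mydb']['records']

# Stream each element of the top-level JSON array and insert it individually,
# so only one record needs to be in memory at a time.
with open('myfile.json', 'rb') as f:  # placeholder filename
    for record in ijson.items(f, 'item'):
        collection.insert_one(record)
```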