How to load an .npz file in a non-lazy way?

Asked 2020-07-30 04:31

NumPy's load() function returns a lazy file loader, not the actual data, when given an .npz file. How can I load an .npz file so that the data actually gets read into memory?
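
For example (a minimal sketch of what I mean; the file name and keys are just placeholders):

import numpy as np

np.savez('dataset.npz', texts=np.arange(5), labels=np.zeros(5))
data = np.load('dataset.npz')
print(type(data))  # numpy.lib.npyio.NpzFile -- a lazy loader, not the arrays themselves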

2 Answers
Fickle 薄情 · answered 2020-07-30 04:44

I think you already answered this in your previous question about speed:

data = np.load(dataset_text_filepath)['texts']

The file contents are now in memory.

The .npz file is a zip archive containing multiple arrays. The reason load is a two-step operation is that you might not always want to load all of the arrays at once: it lets you load x without loading y.
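
For instance, a minimal sketch of that selective loading (the file name and the keys x and y are just placeholders):

import numpy as np

np.savez('example.npz', x=np.arange(10), y=np.ones((1000, 1000)))

archive = np.load('example.npz')  # lazy: no array data is read yet
x = archive['x']                  # only the 'x' member is read into memory
archive.close()                   # 'y' was never loaded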

You could use a system zip tool to extract one or more of the member .npy files and then load those directly. That can be a useful step just to get a better feel for the file structure.

To be any more direct than that you would need to study np.lib.npyio.NpzFile and maybe the standard zipfile module.
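
As a rough sketch along those lines (the file name and member names are placeholders), you can open the archive with zipfile yourself and load a single member as a plain .npy file:

import io
import zipfile
import numpy as np

with zipfile.ZipFile('example.npz') as zf:
    print(zf.namelist())                        # e.g. ['x.npy', 'y.npy']
    with zf.open('x.npy') as member:
        x = np.load(io.BytesIO(member.read()))  # load just this one array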

劳资没心,怎么记你 · answered 2020-07-30 04:57

If you want to force the contents of the arrays to be read and decompressed, just assign their contents to variables, e.g.:

import numpy as np

data = np.load('/path/to/data.npz')  # lazy NpzFile object
a = data['a']  # indexing reads and decompresses this array into memory
b = data['b']
# etc.

If you wanted to keep the exact same syntax as with the lazy loader, you could simply load all of the arrays into a dict, e.g.:

data_dict = dict(data)

So now you could use

data_dict['a']

to refer to a in later parts of your script. Personally I wouldn't keep the dict around, though: since it holds references to all of the arrays, any individual arrays you no longer use can't be garbage collected later on in your script.
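
If you do go the dict route, one sketch (the file path and key names are placeholders) is to do the eager copy inside a with block, so the underlying file handle gets closed once everything is in memory:

import numpy as np

with np.load('/path/to/data.npz') as data:
    arrays = {name: data[name] for name in data.files}  # forces read + decompress

# The NpzFile is closed here; 'arrays' holds ordinary in-memory ndarrays.
print(list(arrays))

You can then del entries from the dict as you finish with them, which avoids the garbage-collection issue mentioned above.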
