Storing a dict with np.savez gives unexpected results

Posted 2019-02-12 09:44

Question:

Can I store a dictionary using np.savez? The results are surprising (to me at least) and I cannot find a way to get my data back by key.

In [1]: a = {'0': {'A': array([1,2,3]), 'B': array([4,5,6])}}
In [2]: a
Out[2]: {'0': {'A': array([1, 2, 3]), 'B': array([4, 5, 6])}}

In [3]: np.savez('model.npz', **a)
In [4]: a = np.load('model.npz')
In [5]: a
Out[5]: <numpy.lib.npyio.NpzFile at 0x7fc9f8acaad0>

In [6]: a['0']
Out[6]: array({'B': array([4, 5, 6]), 'A': array([1, 2, 3])}, dtype=object)

In [7]: a['0']['B']
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-16-c916b98771c9> in <module>()
----> 1 a['0']['B']

ValueError: field named B not found

In [8]: dict(a['0'])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-17-d06b11e8a048> in <module>()
----> 1 dict(a['0'])

TypeError: iteration over a 0-d array

I do not understand exactly what is going on. It seems that my data becomes a dictionary inside a 0-dimensional array, leaving me with no way to get my data back by key. Or am I missing something?

So my questions are:

  1. What happens here? If I can still access my data by key, how?
  2. What is the best way to store data of this type? (a dict with str as key and other dicts as value)

Thanks!

Answer 1:

It is possible to recover the data:

In [41]: a = {'0': {'A': array([1,2,3]), 'B': array([4,5,6])}}

In [42]: np.savez('/tmp/model.npz', **a)

In [43]: a = np.load('/tmp/model.npz')

Notice that the dtype is 'object'.

In [44]: a['0']
Out[44]: array({'A': array([1, 2, 3]), 'B': array([4, 5, 6])}, dtype=object)

And there is only one item in the array. That item is a Python dict!

In [45]: a['0'].size
Out[45]: 1
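To see why indexing by key and dict() both fail in the question, it may help to reproduce the wrapping step without touching a file. The sketch below assumes that np.savez converts each keyword value to an array (as far as I know it uses np.asanyarray for this); the names inner and wrapped are only illustrative.

```python
import numpy as np

inner = {'A': np.array([1, 2, 3]), 'B': np.array([4, 5, 6])}

# A dict is not a sequence, so NumPy wraps it whole in a 0-d object array.
wrapped = np.asanyarray(inner)

print(wrapped.shape)   # ()
print(wrapped.ndim)    # 0
print(wrapped.dtype)   # object
print(wrapped.size)    # 1

# String indexing is treated as a field lookup on the array, not on the dict.
# The exception type differs between NumPy versions, so both are caught here.
try:
    wrapped['B']
except (IndexError, ValueError) as exc:
    print(type(exc).__name__, exc)

# dict() tries to iterate over the array, which a 0-d array does not allow.
try:
    dict(wrapped)
except TypeError as exc:
    print(type(exc).__name__, exc)
```

The dict itself is intact; it is simply one level deeper than the indexing in the question reaches.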

You can retrieve the value using the item() method. (NB: this is not the items() method of dictionaries, nor anything intrinsic to the NpzFile class; it is the numpy.ndarray.item() method, which copies a value from the array to a standard Python scalar.) In an array of object dtype, any value held in a cell of the array (even a dictionary) counts as a Python scalar:

In [46]: a['0'].item()
Out[46]: {'A': array([1, 2, 3]), 'B': array([4, 5, 6])}

In [47]: a['0'].item()['A']
Out[47]: array([1, 2, 3])

In [48]: a['0'].item()['B']
Out[48]: array([4, 5, 6])

To restore a as a dict of dicts:

In [84]: a = np.load('/tmp/model.npz')

In [85]: a = {key:a[key].item() for key in a}

In [86]: a['0']['A']
Out[86]: array([1, 2, 3])


Answer 2:

Based on this answer: recover dict from 0-d numpy array

After

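import numpy as np

a = {'key': 'val'}
np.savez('file.npz', a=a)  # note the use of a keyword for ease later

you can use

get = np.load('file.npz', allow_pickle=True)  # newer NumPy needs allow_pickle=True for object arrays
a = get['a'][()]  # this is crazy maybe, but true: an empty-tuple index unwraps the 0-d array
print(a['key'])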

It would also work without the use of a keyword argument (the array is then stored under the default name arr_0), but I thought this was worth sharing too.
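For comparison, here is a small sketch showing that the empty-tuple index and ndarray.item() recover the same dict from a 0-d object array. No file is involved; wrapped stands in for what loading would return, and the variable names are illustrative.

```python
import numpy as np

# Stand-in for the 0-d object array that loading the saved dict returns.
wrapped = np.array({'key': 'val'}, dtype=object)

via_index = wrapped[()]    # index with an empty tuple
via_item = wrapped.item()  # ndarray.item()

print(type(via_index).__name__)
print(via_index == via_item)
print(via_index['key'])
```

Which spelling to use is mostly a matter of taste; item() is arguably easier to read, while [()] is shorter.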