I extended dict in a simple way to directly access its values with the d.key notation instead of d['key']:
class ddict(dict):
    def __getattr__(self, item):
        return self[item]

    def __setattr__(self, key, value):
        self[key] = value
Now when I try to pickle it, it will call __getattr__ to find __getstate__, which is neither present nor necessary. The same will happen upon unpickling with __setstate__:
>>> import pickle
>>> class ddict(dict):
...     def __getattr__(self, item):
...         return self[item]
...     def __setattr__(self, key, value):
...         self[key] = value
...
>>> pickle.dumps(ddict())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in __getattr__
KeyError: '__getstate__'
How do I have to modify the class ddict in order for it to be properly picklable?
The problem is not pickle but that your __getattr__ method breaks the expected contract by raising KeyError exceptions. You need to fix your __getattr__ method to raise AttributeError exceptions instead:
def __getattr__(self, item):
    try:
        return self[item]
    except KeyError:
        raise AttributeError(item)
Now pickle is given the expected signal for a missing __getstate__ customisation hook.
From the object.__getattr__ documentation:
This method should return the (computed) attribute value or raise an AttributeError exception.
(bold emphasis mine).
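The contract matters beyond pickle, too. Here is a minimal sketch, with class names KeyErrorDict and AttrErrorDict made up for illustration: hasattr() and three-argument getattr() treat AttributeError as "not there" and let any other exception propagate.

```python
class KeyErrorDict(dict):
    """Hypothetical variant that lets KeyError escape from __getattr__."""
    def __getattr__(self, item):
        return self[item]


class AttrErrorDict(dict):
    """Hypothetical variant that translates KeyError into AttributeError."""
    def __getattr__(self, item):
        try:
            return self[item]
        except KeyError:
            raise AttributeError(item) from None


good = AttrErrorDict()
# Both built-ins swallow AttributeError and nothing else.
print(hasattr(good, "missing"))         # False
print(getattr(good, "missing", "n/a"))  # n/a

bad = KeyErrorDict()
try:
    hasattr(bad, "missing")
except KeyError as exc:
    # The KeyError is not caught by hasattr() and reaches the caller.
    print("KeyError leaked out of hasattr():", exc)
```

pickle's probe for __getstate__ relies on the same signal, which is why the KeyError version fails.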
If you insist on keeping the KeyError, then at the very least you need to skip names that start and end with double underscores and raise an AttributeError just for those:
def __getattr__(self, item):
    if isinstance(item, str) and item[:2] == item[-2:] == '__':
        # skip non-existing dunder method lookups
        raise AttributeError(item)
    return self[item]
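A quick sketch of how this variant behaves, assuming the same __setattr__ as in the question: ordinary missing names still raise KeyError, dunder lookups raise AttributeError, and a pickle round trip goes through. The attribute names `missing` and `__no_such_hook__` are arbitrary examples.

```python
import pickle


class ddict(dict):
    def __getattr__(self, item):
        if isinstance(item, str) and item[:2] == item[-2:] == '__':
            # skip non-existing dunder method lookups
            raise AttributeError(item)
        return self[item]

    def __setattr__(self, key, value):
        self[key] = value


d = ddict(foo='bar')
clone = pickle.loads(pickle.dumps(d))
print(type(clone).__name__, clone)  # ddict {'foo': 'bar'}

try:
    d.missing
except KeyError as exc:
    print('ordinary miss raises', type(exc).__name__)

try:
    d.__no_such_hook__
except AttributeError as exc:
    print('dunder miss raises', type(exc).__name__)
```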
Note that you probably want to give your ddict() subclass an empty __slots__ tuple; you don't need the extra __dict__ attribute mapping on your instances, since you are diverting attributes to key-value pairs instead. That saves you a nice chunk of memory per instance.
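To illustrate what the empty __slots__ tuple changes, here is a small sketch with two hypothetical dict subclasses (WithDict and Slotted are names invented for this example). It only shows that the per-instance __dict__ disappears; it does not measure memory.

```python
class WithDict(dict):
    """Hypothetical baseline: a plain dict subclass, instances get a __dict__."""


class Slotted(dict):
    """The same subclass with an empty __slots__ tuple."""
    __slots__ = ()


plain, slotted = WithDict(), Slotted()

print(hasattr(plain, "__dict__"))    # True
print(hasattr(slotted, "__dict__"))  # False

# Without a per-instance __dict__ there is nowhere to store stray attributes.
plain.note = "stored in plain.__dict__"
try:
    slotted.note = "nowhere to go"
except AttributeError as exc:
    print("AttributeError:", exc)
```

In ddict this restriction costs nothing, because __setattr__ already diverts every assignment into the mapping.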
Demo:
>>> import pickle
>>> class ddict(dict):
...     __slots__ = ()
...     def __getattr__(self, item):
...         try:
...             return self[item]
...         except KeyError:
...             raise AttributeError(item)
...     def __setattr__(self, key, value):
...         self[key] = value
...
>>> pickle.dumps(ddict())
b'\x80\x03c__main__\nddict\nq\x00)\x81q\x01.'
>>> type(pickle.loads(pickle.dumps(ddict())))
<class '__main__.ddict'>
>>> d = ddict()
>>> d.foo = 'bar'
>>> d.foo
'bar'
>>> pickle.loads(pickle.dumps(d))
{'foo': 'bar'}
That pickle tests for the __getstate__ method on the instance rather than on the class, as is the norm for special methods, is a discussion for another day.
First of all, I think you may need to distinguish between instance attributes and class attributes.
Section 11.1.4 of the official Python documentation, on pickling, says:
instances of such classes whose __dict__ or the result of calling __getstate__() is picklable (see section The pickle protocol for details).
Therefore, the error you're getting occurs when you try to pickle an instance of the class, not the class itself. In fact, your class definition pickles just fine.
Now, for pickling an object of your class, the problem is that you need to call the parent class's implementation first to properly set things up. The correct code is:
In [1]: import pickle
In [2]: class ddict(dict):
   ...:
   ...:     def __getattr__(self, item):
   ...:         super.__getattr__(self, item)
   ...:         return self[item]
   ...:
   ...:     def __setattr__(self, key, value):
   ...:         super.__setattr__(self, key, value)
   ...:         self[key] = value
   ...:
In [3]: d = ddict()
In [4]: d.name = "Sam"
In [5]: d
Out[5]: {'name': 'Sam'}
In [6]: pickle.dumps(d)
Out[6]: b'\x80\x03c__main__\nddict\nq\x00)\x81q\x01X\x04\x00\x00\x00nameq\x02X\x03\x00\x00\x00Samq\x03s}q\x04h\x02h\x03sb.'
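One thing worth checking about the session above: the pickle output carries 'name' and 'Sam' twice, once as a dict item and once as instance state. The sketch below reuses the class unchanged and suggests, as an observation rather than a definitive analysis, that each value assigned with d.key ends up in both the instance __dict__ and the mapping, and the two copies can drift apart. The key "city" is an arbitrary example.

```python
class ddict(dict):
    def __getattr__(self, item):
        super.__getattr__(self, item)
        return self[item]

    def __setattr__(self, key, value):
        super.__setattr__(self, key, value)
        self[key] = value


d = ddict()
d.name = "Sam"
print(vars(d))  # {'name': 'Sam'}  (instance __dict__)
print(dict(d))  # {'name': 'Sam'}  (mapping contents)

# Updating the mapping does not touch the copy in the instance __dict__.
d["name"] = "Max"
print(d.name)     # Sam
print(d["name"])  # Max

# A key added through item assignment is not reachable as an attribute,
# because the first line of __getattr__ raises AttributeError.
d["city"] = "Oslo"
try:
    d.city
except AttributeError as exc:
    print("AttributeError:", exc)
```

If a single source of truth is wanted, a __getattr__ that translates KeyError into AttributeError, combined with an empty __slots__ tuple, avoids the duplication.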