How to make a class which has __getattr__ properly picklable?

Posted 2019-07-25 15:05

Question:

I extended dict in a simple way to directly access its values with the d.key notation instead of d['key']:

class ddict(dict):

    def __getattr__(self, item):
        return self[item]

    def __setattr__(self, key, value):
        self[key] = value

Now when I try to pickle it, it will call __getattr__ to find __getstate__, which is neither present nor necessary. The same will happen upon unpickling with __setstate__:

>>> import pickle
>>> class ddict(dict):
...     def __getattr__(self, item):
...         return self[item]
...     def __setattr__(self, key, value):
...         self[key] = value
...
>>> pickle.dumps(ddict())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in __getattr__
KeyError: '__getstate__'

How do I have to modify the class ddict so that it is properly picklable?

Answer 1:

The problem is not pickle but that your __getattr__ method breaks the expected contract by raising KeyError exceptions. You need to fix your __getattr__ method to raise AttributeError exceptions instead:

def __getattr__(self, item):
    try:
        return self[item]
    except KeyError:
        raise AttributeError(item)

Now pickle is given the expected signal for a missing __getstate__ customisation hook.
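
pickle is not the only caller that depends on this contract. As a minimal sketch (the class names KeyErrorDict and AttrErrorDict are made up for illustration), hasattr() and three-argument getattr() also treat only AttributeError as "attribute missing", so a leaked KeyError escapes from them:

```python
class KeyErrorDict(dict):
    # Hypothetical name: __getattr__ as in the question, leaking KeyError.
    def __getattr__(self, item):
        return self[item]


class AttrErrorDict(dict):
    # Hypothetical name: __getattr__ translating KeyError into AttributeError.
    def __getattr__(self, item):
        try:
            return self[item]
        except KeyError:
            raise AttributeError(item)


good = AttrErrorDict(foo='bar')
good_has_foo = hasattr(good, 'foo')             # the key exists
good_has_missing = hasattr(good, 'missing')     # AttributeError is swallowed
good_default = getattr(good, 'missing', 'n/a')  # the default is returned

bad = KeyErrorDict(foo='bar')
try:
    hasattr(bad, 'missing')
    leaked = None
except KeyError as exc:
    leaked = exc                                # KeyError escapes hasattr()

print(good_has_foo, good_has_missing, good_default, type(leaked).__name__)
```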

From the object.__getattr__ documentation:

This method should return the (computed) attribute value or raise an AttributeError exception.

(bold emphasis mine).

If you insist on keeping the KeyError, then at the very least you need to skip names that start and end with double underscores and raise an AttributeError just for those:

def __getattr__(self, item):
    if isinstance(item, str) and item[:2] == item[-2:] == '__':
        # skip non-existing dunder method lookups
        raise AttributeError(item)
    return self[item]
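
A rough sketch of how this variant behaves, assuming Python 3 and a class defined at module top level: ordinary missing names still raise KeyError, while missing dunder names get the AttributeError that pickle expects. The key foo and the dunder name __no_such_hook__ are made up for illustration.

```python
import pickle


class ddict(dict):
    def __getattr__(self, item):
        if isinstance(item, str) and item[:2] == item[-2:] == '__':
            # skip non-existing dunder method lookups
            raise AttributeError(item)
        return self[item]

    def __setattr__(self, key, value):
        self[key] = value


d = ddict()
d.foo = 'bar'

# A missing regular name keeps raising KeyError.
try:
    d.missing
    regular_error = None
except KeyError as exc:
    regular_error = type(exc).__name__

# A missing dunder name raises AttributeError, so hasattr() reports False.
has_hook = hasattr(d, '__no_such_hook__')

# The round trip through pickle preserves both the type and the contents.
restored = pickle.loads(pickle.dumps(d))
print(regular_error, has_hook, type(restored).__name__, restored.foo)
```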

Note that you probably want to give your ddict() subclass an empty __slots__ tuple; you don't need the extra __dict__ attribute mapping on your instances, since you are diverting attributes to key-value pairs instead. That saves you a nice chunk of memory per instance.
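
To illustrate what the empty __slots__ tuple changes, here is a small sketch; the class names WithDict and WithoutDict are hypothetical, and neither overrides __setattr__ so that the default behaviour is visible:

```python
class WithDict(dict):
    # Hypothetical name: ordinary subclass, each instance gets a __dict__.
    pass


class WithoutDict(dict):
    # Hypothetical name: empty __slots__ suppresses the per-instance __dict__.
    __slots__ = ()


plain_has_dict = hasattr(WithDict(), '__dict__')
slotted_has_dict = hasattr(WithoutDict(), '__dict__')

# With no __dict__ and no named slots there is nowhere to store an attribute.
try:
    WithoutDict().foo = 'bar'
    slotted_error = None
except AttributeError as exc:
    slotted_error = exc

print(plain_has_dict, slotted_has_dict, type(slotted_error).__name__)
```

For ddict this restriction costs nothing, because its __setattr__ stores every value as a key-value pair anyway.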

Demo:

>>> import pickle
>>> class ddict(dict):
...     __slots__ = ()
...     def __getattr__(self, item):
...         try:
...             return self[item]
...         except KeyError:
...             raise AttributeError(item)
...     def __setattr__(self, key, value):
...         self[key] = value
...
>>> pickle.dumps(ddict())
b'\x80\x03c__main__\nddict\nq\x00)\x81q\x01.'
>>> type(pickle.loads(pickle.dumps(ddict())))
<class '__main__.ddict'>
>>> d = ddict()
>>> d.foo = 'bar'
>>> d.foo
'bar'
>>> pickle.loads(pickle.dumps(d))
{'foo': 'bar'}

Why pickle tests for the __getstate__ method on the instance rather than on the class, as is the norm for special methods, is a discussion for another day.



Answer 2:

First of all, I think you may need to distinguish between instance attributes and class attributes. Chapter 11.1.4 of the official Python documentation, on pickling, lists among the things that can be pickled:

instances of such classes whose __dict__ or the result of calling __getstate__() is picklable (see section The pickle protocol for details).

Therefore, the error you're getting occurs when you try to pickle an instance of the class, not the class itself; in fact, your class definition will pickle just fine.
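
As a small sketch of that distinction, assuming the class lives at module top level so pickle can find it by name: pickling the class stores only a reference to it and never goes through the instance-level __getattr__.

```python
import pickle


class ddict(dict):
    # The class from the question, unchanged.
    def __getattr__(self, item):
        return self[item]

    def __setattr__(self, key, value):
        self[key] = value


# Pickling the class object itself records its module and name only.
payload = pickle.dumps(ddict)
same_class = pickle.loads(payload) is ddict
print(same_class, b'ddict' in payload)
```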

Now for pickling an object of your class, the problem is that you need to call the parent class's serialization implementation first to properly set things up. The correct code is:

In [1]: import pickle

In [2]: class ddict(dict):
   ...:
   ...:     def __getattr__(self, item):
   ...:         super.__getattr__(self, item)
   ...:         return self[item]
   ...:
   ...:     def __setattr__(self, key, value):
   ...:         super.__setattr__(self, key, value)
   ...:         self[key] = value
   ...:

In [3]: d = ddict()

In [4]: d.name = "Sam"

In [5]: d
Out[5]: {'name': 'Sam'}

In [6]: pickle.dumps(d)
Out[6]: b'\x80\x03c__main__\nddict\nq\x00)\x81q\x01X\x04\x00\x00\x00nameq\x02X\x03\x00\x00\x00Samq\x03s}q\x04h\x02h\x03sb.'
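
A closer look at what this session does may help; it is offered as a sketch rather than a definitive reading. The built-in super type defines no __getattr__ attribute, so the expression super.__getattr__ itself raises AttributeError before self[item] is reached. Meanwhile super.__setattr__(self, key, value) stores the value in the instance __dict__ as well as in the mapping, so the pickle in Out[6] carries the pair once as a dict item and once as instance state. The key 'other' below is made up for illustration.

```python
class ddict(dict):
    # Same class as in the session above.
    def __getattr__(self, item):
        super.__getattr__(self, item)
        return self[item]

    def __setattr__(self, key, value):
        super.__setattr__(self, key, value)
        self[key] = value


d = ddict()
d.name = "Sam"

stored_as_attribute = dict(vars(d))  # copy of the instance __dict__
stored_as_item = dict(d)             # copy of the mapping contents

# A key added with item syntax is not reachable through d.key, because
# __getattr__ fails on super.__getattr__ before it consults the mapping.
d['other'] = 1
try:
    d.other
    lookup_error = None
except AttributeError as exc:
    lookup_error = exc

print(stored_as_attribute, stored_as_item, type(lookup_error).__name__)
```

If only the AttributeError contract is wanted, the try/except form from the first answer provides it without the duplicate storage.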


Tags: python pickle