Dynamically change __slots__ in Python 3

Posted 2020-02-06 07:33

Question:

Suppose I have a class with __slots__

class A:
    __slots__ = ['x']

a = A()
a.x = 1   # works fine
a.y = 1   # AttributeError (as expected)

Now I am going to change __slots__ of A.

A.__slots__.append('y')
print(A.__slots__)   # ['x', 'y']
b = A()
b.x = 1   # OK
b.y = 1   # AttributeError (why?)

b was created after the __slots__ of A had changed, so Python could, in principle, allocate memory for b.y. Why didn't it?

How do I properly modify the __slots__ of a class so that new instances have the modified attributes?

Answer 1:

You cannot dynamically alter the __slots__ attribute after creating the class, no. That's because the value is used to create special descriptors for each slot. From the __slots__ documentation:

__slots__ are implemented at the class level by creating descriptors (Implementing Descriptors) for each variable name. As a result, class attributes cannot be used to set default values for instance variables defined by __slots__; otherwise, the class attribute would overwrite the descriptor assignment.
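
As a small illustration of the quoted restriction (the class names here are made up for this sketch): giving a slot name a class-level default fails when the class is created, so defaults are usually assigned in __init__ instead.

```python
# Hypothetical classes, only to illustrate the documented restriction.
try:
    class Broken:
        __slots__ = ['x']
        x = 0          # clashes with the descriptor that __slots__ wants to create
except ValueError as exc:
    error = exc        # class creation is rejected
else:
    error = None


class WithDefault:
    __slots__ = ['x']

    def __init__(self):
        self.x = 0     # per-instance default, stored in the slot
```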

You can see the descriptors in the class __dict__:

>>> class A:
...     __slots__ = ['x']
... 
>>> A.__dict__
mappingproxy({'__module__': '__main__', '__doc__': None, 'x': <member 'x' of 'A' objects>, '__slots__': ['x']})
>>> A.__dict__['x']
<member 'x' of 'A' objects>
>>> a = A()
>>> A.__dict__['x'].__get__(a, A)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: x
>>> A.__dict__['x'].__set__(a, 'foobar')
>>> A.__dict__['x'].__get__(a, A)
'foobar'
>>> a.x
'foobar'

You cannot yourself create these additional descriptors. Even if you could, you cannot allocate more memory space for the extra slot references on the instances produced for this class, as that's information stored in the C struct for the class, and not in a manner accessible to Python code.
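
A rough way to see that fixed layout from Python, assuming CPython (where __basicsize__ reports the size of the instance struct); the class names One and Two are invented for this sketch:

```python
import struct


class One:
    __slots__ = ['x']


class Two:
    __slots__ = ['x', 'y']


pointer_size = struct.calcsize('P')   # size of one C pointer on this build

# On CPython each slot adds one pointer to the instance layout.
size_gap = Two.__basicsize__ - One.__basicsize__

size_before = One.__basicsize__
One.__slots__.append('y')             # only mutates the list left on the class
size_after = One.__basicsize__        # the layout itself is unchanged
```

Here size_gap should equal pointer_size, while size_before and size_after should be identical: appending to the list never reaches the struct.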

That's all because __slots__ merely exposes to Python code the low-level mechanism that lays out the elements making up Python instances; the __dict__ and __weakref__ attributes on regular Python instances were always implemented as slots:

>>> class Regular: pass
... 
>>> Regular.__dict__['__dict__']
<attribute '__dict__' of 'Regular' objects>
>>> Regular.__dict__['__weakref__']
<attribute '__weakref__' of 'Regular' objects>
>>> r = Regular()
>>> Regular.__dict__['__dict__'].__get__(r, Regular) is r.__dict__
True

All the Python developers did here was extend the system to add a few more of such slots using arbitrary names, with those names taken from the __slots__ attribute on the class being created, so that you can save memory; dictionaries take more memory than simple references to values in slots do. By specifying __slots__ you disable the __dict__ and __weakref__ slots, unless you explicitly include those in the __slots__ sequence.
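
To make that last point concrete, here is a small sketch (the names Strict and Flexible are just for illustration) contrasting a class that lists only its own slot with one that adds '__dict__' and '__weakref__' back in:

```python
import weakref


class Strict:
    __slots__ = ('x',)                 # no __dict__, no __weakref__


class Flexible:
    __slots__ = ('x', '__dict__', '__weakref__')


f = Flexible()
f.x = 1                # stored in the slot
f.extra = 2            # stored in the per-instance __dict__
ref = weakref.ref(f)   # allowed because '__weakref__' is listed

s = Strict()
try:
    weakref.ref(s)
    strict_weakref_ok = True
except TypeError:
    # Without a __weakref__ slot the instance cannot be weakly referenced.
    strict_weakref_ok = False
```

Note that listing '__dict__' gives up most of the memory saving, since every instance can grow a dictionary again.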

The only way to extend slots then is to subclass; you can dynamically create a subclass with the type() function or by using a factory function:

def extra_slots_subclass(base, *slots):
    class ExtraSlots(base):
        __slots__ = slots
    ExtraSlots.__name__ = base.__name__
    return ExtraSlots
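
A possible way to use it, repeating the factory so the snippet runs on its own; the names A2 and A3 are only for this sketch, and the type() call is the equivalent one-liner:

```python
def extra_slots_subclass(base, *slots):
    class ExtraSlots(base):
        __slots__ = slots
    ExtraSlots.__name__ = base.__name__
    return ExtraSlots


class A:
    __slots__ = ['x']


# Subclass created by the factory: inherits slot 'x', adds slot 'y'.
A2 = extra_slots_subclass(A, 'y')
b = A2()
b.x = 1
b.y = 2

# The same idea with a direct type() call.
A3 = type(A.__name__, (A,), {'__slots__': ('y',)})
c = A3()
c.y = 3
```

Existing instances of the original A are unaffected; only objects created from the new subclass get the extra slot. The factory copies __name__ only, so __qualname__ still mentions ExtraSlots.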


Answer 2:

It appears to me that type turns __slots__ into a tuple as one of its first orders of business, and then stores that tuple on the extended type object. Since, beneath it all, Python is looking at a tuple, there is no way to mutate it. Indeed, I'm not even sure you can access it unless you passed in a tuple in the first place.

The fact that the original object that you set still remains as an attribute on the type is (perhaps) just a convenience for introspection.
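
A quick check of that idea, as a sketch rather than a statement about the internals: the list left on the class can be mutated freely, but the slot descriptors created when the class was built do not follow it.

```python
class A:
    __slots__ = ['x']


original = A.__slots__               # the very list object written in the class body
A.__slots__.append('y')

slots_now = A.__slots__              # the list did change
same_object = A.__slots__ is original

has_x_descriptor = 'x' in vars(A)    # created when the class was built
has_y_descriptor = 'y' in vars(A)    # nothing re-reads the list afterwards
```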

You can't modify __slots__ and expect the change to show up anywhere (and really, from a readability perspective, you probably don't want to do that anyway, right?).

Of course, you can always subclass to extend the slots:

>>> class C(A):
...   __slots__ = ['z']
... 
>>> c = C()
>>> c.x = 1
>>> c.z = 1


Answer 3:

You cannot modify the __slots__ attribute after class creation, because doing so would lead to strange behaviour.

Imagine the following.

class A:
    __slots__ = ["x"]

a = A()
A.__slots__.append("y")
a.y = None 

What should happen in this scenario? No space was originally allocated for a second slot, but according to the __slots__ attribute, a should be able to have space for y.

__slots__ is not about protecting which names can and cannot be accessed. Rather, __slots__ is about reducing the memory footprint of an object. By attempting to modify __slots__ you would defeat the optimisations that __slots__ is meant to achieve.

How __slots__ reduces memory footprint

Normally, an object's attributes are stored in a dict, which itself requires a fair bit of memory. If you are creating millions of objects, the space required by these dicts becomes prohibitive. __slots__ tells the Python machinery that builds the class object that instances of this class will only ever have a fixed set of attributes, and what their names will be. The class can therefore make an optimisation: it places the memory for the (pointers to the) attributes directly on the instance, rather than creating a new dict for each object.
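
A rough comparison on CPython, as a sketch only: sys.getsizeof is shallow and the exact numbers vary between versions, so none are quoted here. The class names are illustrative.

```python
import sys


class Regular:
    def __init__(self):
        self.x = 1
        self.y = 2


class Slotted:
    __slots__ = ('x', 'y')

    def __init__(self):
        self.x = 1
        self.y = 2


r = Regular()
s = Slotted()

# getsizeof does not follow references, so the instance dict is added by hand.
regular_size = sys.getsizeof(r) + sys.getsizeof(r.__dict__)
slotted_size = sys.getsizeof(s)   # there is no per-instance dict to add
```

On CPython slotted_size should come out smaller than regular_size; multiplied over millions of instances, that gap is the saving this answer describes.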



Answer 4:

Putting together the answers to this and related questions, I want to highlight one solution to this problem:

You can, in effect, modify __slots__ by creating a subclass with the same name and then replacing the parent class with that subclass. Note that you can do this for classes declared and used in any module, not just your own!


Consider the following module which declares some classes:

module.py:
class A(object):
    # some class a user should import
    __slots__ = ('x', 'b')

    def __init__(self):
        self.b = B()

class B(object):
    # let's suppose we can't use it directly,
    # it's returned as a part of another class
    __slots__ = ('z',)

Here's how you can add attributes to these classes:

>>> import module
>>> from module import A
>>>
>>> # for classes imported into your module:
>>> A = type('A', (A,), {'__slots__': ('foo',)})
>>> # for classes which will be instantiated by the `module` itself:
>>> module.B = type('B', (module.B,), {'__slots__': ('bar',)})
>>>
>>> a = A()
>>> a.x = 1
>>> a.foo = 2
>>>
>>> b = a.b
>>> b.z = 3
>>> b.bar = 4
>>>

But what if you receive class instances from some third-party module using the module?

module_3rd_party.py:
from module import A

def get_instance():
    return A()

No problem, that works too! The only difference is that you may need to patch the classes before you import the third-party module (in case it imports classes from the module):

>>> import module
>>>
>>> module.A = type('A', (module.A,), {'__slots__': ('foo',)})
>>> module.B = type('B', (module.B,), {'__slots__': ('bar',)})
>>>
>>> # note that we import `module_3rd_party` AFTER we patch the `module`
>>> from module_3rd_party import get_instance
>>>
>>> a = get_instance()
>>> a.x = 1
>>> a.foo = 2
>>>
>>> b = a.b
>>> b.z = 3
>>> b.bar = 4
>>>

It works because Python imports each module only once and then shares it between all other modules, so the changes you make to a module affect all code running alongside yours.
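
Here is a self-contained sketch of that mechanism. It builds a stand-in for module.py in memory under the made-up name demo_module, so nothing has to exist on disk, and it also shows why the patch has to come before anyone grabs a reference to the class.

```python
import sys
import types

# In-memory stand-in for module.py (the name "demo_module" is hypothetical).
source = '''
class A(object):
    __slots__ = ('x', 'b')

    def __init__(self):
        self.b = B()

class B(object):
    __slots__ = ('z',)
'''
demo_module = types.ModuleType("demo_module")
exec(source, demo_module.__dict__)
sys.modules["demo_module"] = demo_module

# A reference taken BEFORE patching, like an early "from module import A".
early_A = demo_module.A

demo_module.A = type('A', (demo_module.A,), {'__slots__': ('foo',)})
demo_module.B = type('B', (demo_module.B,), {'__slots__': ('bar',)})

import demo_module as again   # served from the sys.modules cache

a = again.A()
a.foo = 2
a.b.bar = 4                   # A.__init__ looks up the global name B at call time

old = early_A()
try:
    old.foo = 2
    early_reference_patched = True
except AttributeError:
    # The early reference still points at the original, unpatched class.
    early_reference_patched = False
```

The early reference misses the new slot on A, yet its b attribute is still an instance of the patched B, because __init__ resolves the name B in the module's globals each time it runs.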