Modifying class __dict__ when shadowed by a property

Posted 2019-04-06 21:53

Question:

I am attempting to modify a value in a class __dict__ directly using something like X.__dict__['x'] += 1. That does not work, because a class __dict__ is actually a mappingproxy object, which does not support item assignment. The reason for attempting direct modification (or an equivalent) is that I am trying to hide the class attribute behind a property of the same name defined on the metaclass. Here is an example:

class Meta(type):
    def __new__(cls, name, bases, attrs, **kwargs):
        attrs['x'] = 0
        return super().__new__(cls, name, bases, attrs)
    @property
    def x(cls):
        return cls.__dict__['x']

class Class(metaclass=Meta):
    def __init__(self):
        self.id = __class__.x
        __class__.__dict__['x'] += 1

This example shows a scheme for creating an auto-incremented ID for each instance of Class. The line __class__.__dict__['x'] += 1 cannot be replaced by setattr(__class__, 'x', __class__.x + 1), because x is a property with no setter in Meta. That would just turn the TypeError from mappingproxy into an AttributeError from property.
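To make the two failure modes concrete, here is a minimal sketch that reuses the Meta from above with an empty class body and records which exception each attempt raises. The `errors` list is only a hypothetical helper for the demonstration.

```python
import types


class Meta(type):
    def __new__(mcs, name, bases, attrs, **kwargs):
        attrs['x'] = 0
        return super().__new__(mcs, name, bases, attrs)

    @property
    def x(cls):
        return cls.__dict__['x']


class Class(metaclass=Meta):
    pass


# The class __dict__ is a read-only view, not a plain dict.
assert isinstance(Class.__dict__, types.MappingProxyType)

errors = []  # hypothetical helper: names of the exceptions we hit

try:
    # Reading works, but writing the result back into the proxy fails.
    Class.__dict__['x'] += 1
except TypeError as exc:
    errors.append(type(exc).__name__)

try:
    # setattr finds the metaclass property first, and it has no setter.
    setattr(Class, 'x', Class.x + 1)
except AttributeError as exc:
    errors.append(type(exc).__name__)

print(errors)
```

In both cases the stored value is left untouched, so Class.x still reads 0 afterwards.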

I have tried messing with __prepare__, but that has no effect. The implementation in type already returns a mutable dict for the namespace. The immutable mappingproxy seems to get set in type.__new__, which I don't know how to avoid.

I have also attempted to rebind the entire __dict__ reference to a mutable version, but that failed as well: https://ideone.com/w3HqNf, implying that perhaps the mappingproxy is not created in type.__new__.

How can I modify a class dict value directly, even when shadowed by a metaclass property? While it may be effectively impossible, setattr is able to do it somehow, so I would expect that there is a solution.

My main requirement is to have a class attribute that appears to be read only and does not use additional names anywhere. I am not absolutely hung up on the idea of using a metaclass property with an eponymous class dict entry, but that is usually how I hide read only values in regular instances.

EDIT

I finally figured out where the class __dict__ becomes immutable. It is described in the last paragraph of the "Creating the Class Object" section of the Data Model reference:

When a new class is created by type.__new__, the object provided as the namespace parameter is copied to a new ordered mapping and the original object is discarded. The new copy is wrapped in a read-only proxy, which becomes the __dict__ attribute of the class object.
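The copy described there can be observed with a small experiment. This is only a sketch: the `captured` dict and this stripped-down Meta (without the property) are assumptions made for the demonstration, not part of the scheme above.

```python
import types

captured = {}  # hypothetical holder for the namespace handed out by __prepare__


class Meta(type):
    @classmethod
    def __prepare__(mcs, name, bases, **kwargs):
        namespace = {}
        captured['namespace'] = namespace
        return namespace


class Class(metaclass=Meta):
    x = 0


namespace = captured['namespace']

# The class body was executed in our dict...
assert namespace['x'] == 0
# ...but the class exposes a read-only proxy around a copy of it.
assert isinstance(Class.__dict__, types.MappingProxyType)

# Mutating the original namespace after class creation has no effect.
namespace['x'] = 99
print(Class.x, Class.__dict__['x'])
```

So keeping a reference to the dict returned by __prepare__ does not give a back door into the class namespace.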

Answer 1:

Probably the best way: just pick another name. Call the property x and the dict key '_x', so you can access it the normal way.
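A minimal sketch of that renaming, assuming the same auto-increment scheme as in the question. The key '_x' is the only name changed.

```python
class Meta(type):
    def __new__(mcs, name, bases, attrs, **kwargs):
        attrs['_x'] = 0  # private storage under a different name
        return super().__new__(mcs, name, bases, attrs)

    @property
    def x(cls):
        # Read-only public view of the private counter.
        return cls._x


class Class(metaclass=Meta):
    def __init__(self):
        self.id = __class__.x
        # No property shadows '_x', so ordinary attribute assignment works.
        __class__._x += 1


a, b = Class(), Class()
print(a.id, b.id, Class.x)
```

Class.x still cannot be assigned to, since the property has no setter, while the counter itself is updated through normal attribute access.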

Alternative way: add another layer of indirection:

class Meta(type):
    def __new__(cls, name, bases, attrs, **kwargs):
        attrs['x'] = [0]
        return super().__new__(cls, name, bases, attrs)
    @property
    def x(cls):
        return cls.__dict__['x'][0]

class Class(metaclass=Meta):
    def __init__(self):
        self.id = __class__.x
        __class__.__dict__['x'][0] += 1

That way you don't have to modify the actual entry in the class dict.

Super-hacky way that might outright segfault your Python: access the underlying dict through the gc module.

import gc

class Meta(type):
    def __new__(cls, name, bases, attrs, **kwargs):
        attrs['x'] = 0
        return super().__new__(cls, name, bases, attrs)
    @property
    def x(cls):
        return cls.__dict__['x']

class Class(metaclass=Meta):
    def __init__(self):
        self.id = __class__.x
        gc.get_referents(__class__.__dict__)[0]['x'] += 1

This bypasses critical work type.__setattr__ does to maintain internal invariants, particularly in things like CPython's type attribute cache. It is a terrible idea, and I'm only mentioning it so I can put this warning here, because if someone else comes up with it, they might not know that messing with the underlying dict is legitimately dangerous.

It is very easy to end up with dangling references doing this, and I have segfaulted Python quite a few times experimenting with this. Here's one simple case that crashed on Ideone:

import gc

class Foo(object):
    x = []

Foo().x
gc.get_referents(Foo.__dict__)[0]['x'] = []

print(Foo().x)

Output:

*** Error in `python3': double free or corruption (fasttop): 0x000055d69f59b110 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x70bcb)[0x2b32d5977bcb]
/lib/x86_64-linux-gnu/libc.so.6(+0x76f96)[0x2b32d597df96]
/lib/x86_64-linux-gnu/libc.so.6(+0x7778e)[0x2b32d597e78e]
python3(+0x2011f5)[0x55d69f02d1f5]
python3(+0x6be7a)[0x55d69ee97e7a]
python3(PyCFunction_Call+0xd1)[0x55d69efec761]
python3(PyObject_Call+0x47)[0x55d69f035647]
... [it continues like that for a while]

And here's a case with wrong results and no noisy error message to alert you to the fact that something has gone wrong:

import gc

class Foo(object):
    x = 'foo'

print(Foo().x)

gc.get_referents(Foo.__dict__)[0]['x'] = 'bar'

print(Foo().x)

Output:

foo
foo

I make absolutely no guarantees as to any safe way to use this, and even if things happen to work out on one Python version, they may not work on future versions. It can be fun to fiddle with, but it's not something to actually use. Seriously, don't do it. Do you want to explain to your boss that your website went down or your published data analysis will need to be retracted because you took this bad idea and used it?



Answer 2:

This probably counts as an "additional name" you don't want, but I've implemented this using a dictionary in the metaclass where the keys are the classes. The __next__ method on the metaclass makes the class itself an iterator, so you can just call next() on the class to get the next ID. Because it is defined on the metaclass, the method is also not available through the instances. The dictionary storing the next ID has a name starting with a double underscore, so it's not easily discoverable from any of the classes that use it. The incrementing ID functionality is thus entirely contained in the metaclass.

I tucked the assignment of the id into a __new__ method on a base class, so you don't have to worry about it in __init__. This also allows you to del Meta so all the machinery is a little harder to get to.

class Meta(type):
    __ids = {}

    @property
    def id(cls):
        return __class__.__ids.setdefault(cls, 0)

    def __next__(cls):
        id = __class__.__ids.setdefault(cls, 0)
        __class__.__ids[cls] += 1
        return id

class Base(metaclass=Meta):
    def __new__(cls, *args, **kwargs):
        self = object.__new__(cls)
        self.id = next(cls)
        return self

del Meta

class Class(Base):
    pass

class Brass(Base):
    pass

c0 = Class()
c1 = Class()

b0 = Brass()
b1 = Brass()

assert (b0.id, b1.id, c0.id, c1.id) == (0, 1, 0, 1)
assert (Class.id, Brass.id) == (2, 2)
assert not hasattr(Class, "__ids")
assert not hasattr(Brass, "__ids")

Note that I've used the same name for the attribute on both the class and the object. That way Class.id is the number of instances you've created, while c1.id is the ID of that specific instance.



Answer 3:

My main requirement is to have a class attribute that appears to be read only and does not use additional names anywhere. I am not absolutely hung up on the idea of using a metaclass property with an eponymous class dict entry, but that is usually how I hide read only values in regular instances.

What you are asking for is a contradiction: If your example worked, then __class__.__dict__['x'] would be an "additional name" for the attribute. So clearly we need a more specific definition of "additional name." But to come up with that definition, we need to know what you are trying to accomplish (NB: The following goals are not mutually exclusive, so you may want to do all of these things):

  • You want to make the value completely untouchable, except within the Class.__init__() method (and the same method of any subclasses): This is unPythonic and quite impossible. If __init__() can modify the value, then so can anyone else. You might be able to accomplish something like this if the modifying code lives in Class.__new__(), which the metaclass dynamically creates in Meta.__new__(), but that's extremely ugly and hard to understand.
  • You want the code that manipulates the value to be "nicely encapsulated": Write a method in the metaclass that increments the private value (or does whatever other modification you need), and provide a read-only metaclass property that accesses it under the public name.
  • You are concerned about a subclass accidentally clashing names with the private name: Prefix the private name with a double underscore to invoke automatic name mangling. While this is usually seen as a bit unPythonic, it is appropriate for cases where name collisions may be less obvious to subclass authors, such as the internal names of a metaclass colliding with the internal names of a regular class instantiated from it.
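As a rough sketch of how the last two points might combine: the names `count`, `next_id` and `__count` below are hypothetical, chosen only for illustration. Note that name mangling applies to identifiers written inside the class body, not to string keys, which is why the counter is set as an attribute in Meta.__init__ rather than via attrs['__count'].

```python
class Meta(type):
    def __init__(cls, name, bases, attrs, **kwargs):
        super().__init__(name, bases, attrs)
        # Mangled to cls._Meta__count, so it cannot clash with a
        # '__count' defined by the class or its subclasses.
        cls.__count = 0

    @property
    def count(cls):
        # Read-only public view of the private counter.
        return cls.__count

    def next_id(cls):
        # The only code path that modifies the counter.
        value = cls.__count
        cls.__count = value + 1
        return value


class Class(metaclass=Meta):
    def __init__(self):
        self.id = __class__.next_id()


a, b = Class(), Class()
print(a.id, b.id, Class.count)
```

Because next_id and count live on the metaclass, they are reachable from Class but not from its instances, and Class.count cannot be assigned to directly.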