Python self and super in multiple inheritance

Posted 2020-06-01 07:26

Question:

In Raymond Hettinger's talk "Super considered super!" at PyCon 2015, he explains the advantages of using super in Python in a multiple inheritance context. This is one of the examples that Raymond used during his talk:

class DoughFactory(object):
    def get_dough(self):
        return 'insecticide treated wheat dough'


class Pizza(DoughFactory):
    def order_pizza(self, *toppings):
        print('Getting dough')
        dough = super().get_dough()
        print('Making pie with %s' % dough)
        for topping in toppings:
            print('Adding: %s' % topping)


class OrganicDoughFactory(DoughFactory):
    def get_dough(self):
        return 'pure untreated wheat dough'


class OrganicPizza(Pizza, OrganicDoughFactory):
    pass


if __name__ == '__main__':
    OrganicPizza().order_pizza('Sausage', 'Mushroom')

Somebody in the audience asked Raymond what difference it would make to use self.get_dough() instead of super().get_dough(). I didn't fully understand Raymond's brief answer, so I coded both implementations of this example to see the differences. The output is the same in both cases:

Getting dough
Making pie with pure untreated wheat dough
Adding: Sausage
Adding: Mushroom

If you change the base class order from OrganicPizza(Pizza, OrganicDoughFactory) to OrganicPizza(OrganicDoughFactory, Pizza) and use self.get_dough(), you get this result:

Making pie with pure untreated wheat dough

However, if you use super().get_dough(), this is the output:

Making pie with insecticide treated wheat dough

I understand the super() behavior as Raymond explained it. But what is the expected behavior of self in a multiple inheritance scenario?

Answer 1:

Just to clarify, there are four cases, based on changing the second line in Pizza.order_pizza and the definition of OrganicPizza:

  1. super(), (Pizza, OrganicDoughFactory) (original): 'Making pie with pure untreated wheat dough'
  2. self, (Pizza, OrganicDoughFactory): 'Making pie with pure untreated wheat dough'
  3. super(), (OrganicDoughFactory, Pizza): 'Making pie with insecticide treated wheat dough'
  4. self, (OrganicDoughFactory, Pizza): 'Making pie with pure untreated wheat dough'

Case 3 is the one that surprised you; if we switch the order of inheritance but still use super, we apparently end up calling the original DoughFactory.get_dough.
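The four cases can be reproduced in one script. This is only a sketch: SuperPizza, SelfPizza, the dough() method and the Case1..Case4 classes are names I made up so that both call styles and both base orders can live side by side; they are not from the talk.

```python
class DoughFactory(object):
    def get_dough(self):
        return 'insecticide treated wheat dough'


class OrganicDoughFactory(DoughFactory):
    def get_dough(self):
        return 'pure untreated wheat dough'


class SuperPizza(DoughFactory):
    # Hypothetical Pizza variant: asks the class *after SuperPizza* in the MRO.
    def dough(self):
        return super().get_dough()


class SelfPizza(DoughFactory):
    # Hypothetical Pizza variant: starts the lookup at the instance's own class.
    def dough(self):
        return self.get_dough()


class Case1(SuperPizza, OrganicDoughFactory): pass   # super(), Pizza first
class Case2(SelfPizza, OrganicDoughFactory): pass    # self, Pizza first
class Case3(OrganicDoughFactory, SuperPizza): pass   # super(), factory first
class Case4(OrganicDoughFactory, SelfPizza): pass    # self, factory first


results = {cls.__name__: cls().dough() for cls in (Case1, Case2, Case3, Case4)}
for name, dough in results.items():
    print(name, '->', dough)
# Only Case3 prints 'insecticide treated wheat dough'.
```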


What super really does is ask "which is next in the MRO (method resolution order)?" So what does OrganicPizza.mro() look like?

  • (Pizza, OrganicDoughFactory): [<class '__main__.OrganicPizza'>, <class '__main__.Pizza'>, <class '__main__.OrganicDoughFactory'>, <class '__main__.DoughFactory'>, <class 'object'>]
  • (OrganicDoughFactory, Pizza): [<class '__main__.OrganicPizza'>, <class '__main__.OrganicDoughFactory'>, <class '__main__.Pizza'>, <class '__main__.DoughFactory'>, <class 'object'>]

The crucial question here is: which comes after Pizza? As we're calling super from inside Pizza, that is where Python will go to find get_dough*. With the (Pizza, OrganicDoughFactory) ordering of cases 1 and 2 it's OrganicDoughFactory, so super() gets the pure, untreated dough, but with the (OrganicDoughFactory, Pizza) ordering of cases 3 and 4 it's the original, insecticide-treated DoughFactory.
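A quick way to check "which comes after Pizza" for either ordering is to index into the MRO. The helper next_after and the class names PizzaFirst and FactoryFirst below are my own, used only for illustration.

```python
class DoughFactory(object):
    pass


class Pizza(DoughFactory):
    pass


class OrganicDoughFactory(DoughFactory):
    pass


class PizzaFirst(Pizza, OrganicDoughFactory):
    pass


class FactoryFirst(OrganicDoughFactory, Pizza):
    pass


def next_after(cls, current):
    """Hypothetical helper: name of the class that follows `current` in cls's MRO."""
    mro = cls.mro()
    return mro[mro.index(current) + 1].__name__


after_pizza = {
    'PizzaFirst': next_after(PizzaFirst, Pizza),
    'FactoryFirst': next_after(FactoryFirst, Pizza),
}
print(after_pizza)
# {'PizzaFirst': 'OrganicDoughFactory', 'FactoryFirst': 'DoughFactory'}
```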


Why is self different, then? self is always the instance, so Python goes looking for get_dough from the start of the MRO. In both cases, as shown above, OrganicDoughFactory is earlier in the list than DoughFactory, which is why the self versions always get untreated dough; self.get_dough always resolves to OrganicDoughFactory.get_dough(self).
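Roughly speaking, a lookup through self walks type(self).__mro__ from the front and stops at the first class that defines the name. The find_definer function below is a simplified stand-in for that search (it ignores the instance dictionary and descriptors), not how CPython literally implements it.

```python
class DoughFactory(object):
    def get_dough(self):
        return 'insecticide treated wheat dough'


class Pizza(DoughFactory):
    pass


class OrganicDoughFactory(DoughFactory):
    def get_dough(self):
        return 'pure untreated wheat dough'


class PizzaFirst(Pizza, OrganicDoughFactory):
    pass


class FactoryFirst(OrganicDoughFactory, Pizza):
    pass


def find_definer(obj, name):
    """Hypothetical helper: first class in the instance's MRO that defines `name`."""
    for klass in type(obj).__mro__:
        if name in vars(klass):
            return klass
    raise AttributeError(name)


definers = [find_definer(cls(), 'get_dough').__name__
            for cls in (PizzaFirst, FactoryFirst)]
print(definers)
# ['OrganicDoughFactory', 'OrganicDoughFactory']
```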


* I think that this is actually clearer in the two-argument form of super used in Python 2.x, which would be super(Pizza, self).get_dough(); the first argument is the class to skip (i.e. Python looks in the rest of the MRO after that class).
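For comparison, here are the zero-argument and two-argument forms side by side; in Python 3 they behave the same inside a method. The method names implicit and explicit, and the class FactoryFirst, are made up for this sketch.

```python
class DoughFactory(object):
    def get_dough(self):
        return 'insecticide treated wheat dough'


class Pizza(DoughFactory):
    def implicit(self):
        # Python 3 form: the class and instance are filled in automatically.
        return super().get_dough()

    def explicit(self):
        # Two-argument form: search type(self)'s MRO *after* Pizza.
        return super(Pizza, self).get_dough()


class OrganicDoughFactory(DoughFactory):
    def get_dough(self):
        return 'pure untreated wheat dough'


class FactoryFirst(OrganicDoughFactory, Pizza):
    pass


pizza = FactoryFirst()
print(pizza.implicit())   # insecticide treated wheat dough
print(pizza.explicit())   # insecticide treated wheat dough
```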



Answer 2:

I'd like to share a few observations on this.

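Calling self.get_dough() is not an option inside a method that overrides get_dough() of the parent class, because the call would resolve to the override itself and recurse until Python raises RecursionError. An override has to use super() or name the parent class, like here: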

class AbdullahStore(DoughFactory):
    def get_dough(self):
        return 'Abdullah`s special ' + super().get_dough()
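To see what goes wrong with self inside an override, here is a deliberately broken variant. BrokenStore is a hypothetical class written only to trigger the failure.

```python
class DoughFactory(object):
    def get_dough(self):
        return 'insecticide treated wheat dough'


class BrokenStore(DoughFactory):
    def get_dough(self):
        # self.get_dough resolves to BrokenStore.get_dough again,
        # so this method keeps calling itself.
        return 'Abdullah`s special ' + self.get_dough()


try:
    BrokenStore().get_dough()
    outcome = 'returned normally'
except RecursionError:
    outcome = 'RecursionError'

print(outcome)  # RecursionError
```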

I think this is a frequent scenario in practice. If we call DoughFactory.get_dough(self) directly, then the behaviour is fixed. A class deriving from AbdullahStore would have to override the complete method and could not reuse the 'added value' of AbdullahStore. On the other hand, if we use super().get_dough(), this has the flavour of a template: in any class derived from AbdullahStore, say

class Kebab(AbdullahStore):
    def order_kebab(self, sauce):
        dough = self.get_dough()
        print('Making kebab with %s and %s sauce' % (dough, sauce))

we can 'instantiate' get_dough() used in AbdullahStore differently, by intercepting it in the MRO like this:

class OrganicKebab(Kebab, OrganicDoughFactory): pass

Here's what it does:

Kebab().order_kebab('spicy')
Making kebab with Abdullah`s special insecticide treated wheat dough and spicy sauce
OrganicKebab().order_kebab('spicy')
Making kebab with Abdullah`s special pure untreated wheat dough and spicy sauce
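The difference between hard-coding the parent and using super() shows up as soon as OrganicDoughFactory is mixed in. FixedStore and CooperativeStore are hypothetical names for the two styles; CooperativeStore is written the same way as AbdullahStore above.

```python
class DoughFactory(object):
    def get_dough(self):
        return 'insecticide treated wheat dough'


class OrganicDoughFactory(DoughFactory):
    def get_dough(self):
        return 'pure untreated wheat dough'


class FixedStore(DoughFactory):
    def get_dough(self):
        # Parent named explicitly: always DoughFactory, whatever the MRO is.
        return 'Abdullah`s special ' + DoughFactory.get_dough(self)


class CooperativeStore(DoughFactory):
    def get_dough(self):
        # Next class in the instance's MRO: can be intercepted by a mixin.
        return 'Abdullah`s special ' + super().get_dough()


class FixedOrganic(FixedStore, OrganicDoughFactory):
    pass


class CooperativeOrganic(CooperativeStore, OrganicDoughFactory):
    pass


fixed = FixedOrganic().get_dough()
cooperative = CooperativeOrganic().get_dough()
print(fixed)        # Abdullah`s special insecticide treated wheat dough
print(cooperative)  # Abdullah`s special pure untreated wheat dough
```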

Since OrganicDoughFactory has a single parent, DoughFactory, it is guaranteed to be placed before DoughFactory in the MRO, and thus overrides its methods for all the preceding classes in the MRO. It took me some time to understand the C3 linearization algorithm used to construct the MRO. The problem is that the two rules

children come before parents
parents order is preserved

from this reference https://rhettinger.wordpress.com/2011/05/26/super-considered-super/ do not by themselves define the ordering unambiguously. In the class hierarchy

D->C->B->A
 \      /
   --E--

(class A; class B(A); class C(B); class E(A); class D(C, E)), where will E be inserted in the MRO? Is it DCBEA or DCEBA? Perhaps until one can confidently answer questions like this, it is not such a good idea to start inserting super everywhere. I am still not completely sure, but I think C3 linearization, which is unambiguous and chooses the ordering DCBEA in this example, does allow us to do the interception trick the way we did it.
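Rather than working through C3 by hand, one can simply ask the interpreter. This builds exactly the hierarchy from the diagram and prints the order with object left out.

```python
class A(object): pass
class B(A): pass
class C(B): pass
class E(A): pass
class D(C, E): pass

# Join the class names in MRO order, skipping the implicit `object` at the end.
order = ''.join(klass.__name__ for klass in D.__mro__ if klass is not object)
print(order)  # DCBEA
```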

Now, I suppose you can predict the result of

class KebabNPizza(Kebab, OrganicPizza): pass
KebabNPizza().order_kebab('hot')

which is an improved kebab:

Making kebab with Abdullah`s special pure untreated wheat dough and hot sauce

But it has probably taken you some time to calculate.
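If you would rather check than calculate, printing the MRO shows where AbdullahStore's super() call lands. This sketch repeats the classes from above, trimmed to what the kebab example needs, and makes order_kebab return the message in addition to printing it so that the result can be inspected.

```python
class DoughFactory(object):
    def get_dough(self):
        return 'insecticide treated wheat dough'


class OrganicDoughFactory(DoughFactory):
    def get_dough(self):
        return 'pure untreated wheat dough'


class Pizza(DoughFactory):
    pass


class OrganicPizza(Pizza, OrganicDoughFactory):
    pass


class AbdullahStore(DoughFactory):
    def get_dough(self):
        return 'Abdullah`s special ' + super().get_dough()


class Kebab(AbdullahStore):
    def order_kebab(self, sauce):
        dough = self.get_dough()
        message = 'Making kebab with %s and %s sauce' % (dough, sauce)
        print(message)
        return message  # returned only so the result can be checked


class KebabNPizza(Kebab, OrganicPizza):
    pass


mro_names = [klass.__name__ for klass in KebabNPizza.mro()]
print(mro_names)
# ['KebabNPizza', 'Kebab', 'AbdullahStore', 'OrganicPizza', 'Pizza',
#  'OrganicDoughFactory', 'DoughFactory', 'object']
message = KebabNPizza().order_kebab('hot')
```

The first class after AbdullahStore that defines get_dough is OrganicDoughFactory, which is why the kebab comes out organic.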

When I first looked at the super docs https://docs.python.org/3.5/library/functions.html?highlight=super#super a week ago, coming from a C++ background, my reaction was "wow, OK, here are the rules, but how can this ever work and not stab you in the back?". Now I understand more about it, but I am still reluctant to insert super everywhere. I think most of the code I have seen does it just because super() is more convenient to type than the base class name. And that is without even mentioning the extreme use of super() for chaining __init__ functions. What I observe in practice is that everybody writes constructors with the signature that is convenient for their class (and not a universal one) and uses super() to call what they think is their base class constructor.
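Here is a small sketch of how that __init__ habit can bite. All the class names (Base, Left, Right, Both) and parameters are invented for illustration. Each class works fine on its own, but once they are combined, super() in Left no longer reaches Base.

```python
class Base(object):
    def __init__(self, name):
        self.name = name


class Left(Base):
    def __init__(self, name, size):
        # The author assumes this calls Base.__init__ ...
        super().__init__(name)
        self.size = size


class Right(Base):
    def __init__(self, name, color):
        super().__init__(name)
        self.color = color


class Both(Left, Right):
    def __init__(self, name, size, color):
        super().__init__(name, size)
        self.color = color


# ... but in Both's MRO the class after Left is Right, not Base.
mro_names = [klass.__name__ for klass in Both.mro()]
print(mro_names)  # ['Both', 'Left', 'Right', 'Base', 'object']

try:
    Both('margherita', 30, 'red')
    error = None
except TypeError as exc:
    # Right.__init__ was called with only `name`, so `color` is missing.
    error = str(exc)

print(error)
```

Left() and Right() can each be constructed without trouble; the TypeError only appears in the combined class, which is the kind of surprise I mean.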