In his PyCon 2015 talk "Super considered super!", Raymond Hettinger explains the advantages of using super in Python in a multiple inheritance context. This is one of the examples that Raymond used during his talk:
class DoughFactory(object):
    def get_dough(self):
        return 'insecticide treated wheat dough'

class Pizza(DoughFactory):
    def order_pizza(self, *toppings):
        print('Getting dough')
        dough = super().get_dough()
        print('Making pie with %s' % dough)
        for topping in toppings:
            print('Adding: %s' % topping)

class OrganicDoughFactory(DoughFactory):
    def get_dough(self):
        return 'pure untreated wheat dough'

class OrganicPizza(Pizza, OrganicDoughFactory):
    pass

if __name__ == '__main__':
    OrganicPizza().order_pizza('Sausage', 'Mushroom')
Somebody in the audience asked Raymond about the difference between using self.get_dough() and super().get_dough(). I didn't fully understand Raymond's brief answer, so I coded both implementations of this example to see the differences. The output is the same in both cases:
Getting dough
Making pie with pure untreated wheat dough
Adding: Sausage
Adding: Mushroom
If you change the class order from OrganicPizza(Pizza, OrganicDoughFactory) to OrganicPizza(OrganicDoughFactory, Pizza), then with self.get_dough() you get this result:
Making pie with pure untreated wheat dough
However, if you use super().get_dough(), this is the output:
Making pie with insecticide treated wheat dough
I understand the super() behavior as Raymond explained it. But what is the expected behavior of self in a multiple inheritance scenario?
Just to clarify, there are four cases, based on changing the second line in Pizza.order_pizza and the definition of OrganicPizza:
1. super(), (Pizza, OrganicDoughFactory) (original): 'Making pie with pure untreated wheat dough'
2. self, (Pizza, OrganicDoughFactory): 'Making pie with pure untreated wheat dough'
3. super(), (OrganicDoughFactory, Pizza): 'Making pie with insecticide treated wheat dough'
4. self, (OrganicDoughFactory, Pizza): 'Making pie with pure untreated wheat dough'
Case 3 is the one that surprised you: if we switch the order of inheritance but still use super, we apparently end up calling the original DoughFactory.get_dough.
What super really does is ask "which is next in the MRO (method resolution order)?" So what does OrganicPizza.mro() look like?
(Pizza, OrganicDoughFactory): [<class '__main__.OrganicPizza'>, <class '__main__.Pizza'>, <class '__main__.OrganicDoughFactory'>, <class '__main__.DoughFactory'>, <class 'object'>]
(OrganicDoughFactory, Pizza): [<class '__main__.OrganicPizza'>, <class '__main__.OrganicDoughFactory'>, <class '__main__.Pizza'>, <class '__main__.DoughFactory'>, <class 'object'>]
The crucial question here is: which comes after Pizza? As we're calling super from inside Pizza, that is where Python will go to find get_dough*. For cases 1 and 2 it's OrganicDoughFactory, so we get the pure, untreated dough, but for cases 3 and 4 it's the original, insecticide-treated DoughFactory.
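The "which comes after Pizza?" question can be checked directly. This is only a minimal sketch: next_after is a hypothetical helper, and PizzaFirst / FactoryFirst are names made up here for the two inheritance orders. Strictly speaking, super() keeps walking the MRO until it finds a class that defines the attribute, but in this example the very next class already defines get_dough.

```python
class DoughFactory:
    def get_dough(self):
        return 'insecticide treated wheat dough'

class Pizza(DoughFactory):
    pass

class OrganicDoughFactory(DoughFactory):
    def get_dough(self):
        return 'pure untreated wheat dough'

# Hypothetical names for the two orderings of OrganicPizza's bases.
class PizzaFirst(Pizza, OrganicDoughFactory):
    pass

class FactoryFirst(OrganicDoughFactory, Pizza):
    pass

def next_after(cls, start):
    """Return the class that directly follows `start` in cls's MRO."""
    mro = cls.mro()
    return mro[mro.index(start) + 1]

print(next_after(PizzaFirst, Pizza).__name__)    # OrganicDoughFactory
print(next_after(FactoryFirst, Pizza).__name__)  # DoughFactory
```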
Why is self different, then? self is always the instance, so Python goes looking for get_dough from the start of the MRO. In both cases, as shown above, OrganicDoughFactory is earlier in the list than DoughFactory, which is why the self versions always get untreated dough; self.get_dough always resolves to OrganicDoughFactory.get_dough(self).
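All four cases can be run side by side. In this sketch the original Pizza is split into two hypothetical variants, SuperPizza and SelfPizza, and the method is shortened to a hypothetical dough() that just returns the string; Case1 to Case4 are made-up names matching the numbering above.

```python
class DoughFactory:
    def get_dough(self):
        return 'insecticide treated wheat dough'

class OrganicDoughFactory(DoughFactory):
    def get_dough(self):
        return 'pure untreated wheat dough'

class SuperPizza(DoughFactory):
    def dough(self):
        # Lookup starts just after SuperPizza in the instance's MRO.
        return super().get_dough()

class SelfPizza(DoughFactory):
    def dough(self):
        # Lookup starts at the beginning of the instance's MRO.
        return self.get_dough()

class Case1(SuperPizza, OrganicDoughFactory): pass
class Case2(SelfPizza, OrganicDoughFactory): pass
class Case3(OrganicDoughFactory, SuperPizza): pass
class Case4(OrganicDoughFactory, SelfPizza): pass

for cls in (Case1, Case2, Case3, Case4):
    print(cls.__name__, '->', cls().dough())
# Case1 -> pure untreated wheat dough
# Case2 -> pure untreated wheat dough
# Case3 -> insecticide treated wheat dough
# Case4 -> pure untreated wheat dough
```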
* I think that this is actually clearer in the two-argument form of super used in Python 2.x, which would be super(Pizza, self).get_dough(); the first argument is the class to skip (i.e. Python looks in the rest of the MRO after that class).
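The two-argument form is still accepted in Python 3, so the "class to skip" reading can be tried out. A small sketch, where dough_implicit and dough_explicit are hypothetical method names added only for the comparison:

```python
class DoughFactory:
    def get_dough(self):
        return 'insecticide treated wheat dough'

class OrganicDoughFactory(DoughFactory):
    def get_dough(self):
        return 'pure untreated wheat dough'

class Pizza(DoughFactory):
    def dough_implicit(self):
        return super().get_dough()

    def dough_explicit(self):
        # Same lookup, spelled out: search the MRO of type(self) after Pizza.
        return super(Pizza, self).get_dough()

class OrganicPizza(Pizza, OrganicDoughFactory):
    pass

pizza = OrganicPizza()
print(pizza.dough_implicit() == pizza.dough_explicit())  # True

# Skipping past OrganicDoughFactory instead leaves only DoughFactory to search.
print(super(OrganicDoughFactory, pizza).get_dough())
# insecticide treated wheat dough
```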
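I'd like to share a few observations on this.
Calling self.get_dough() may not be possible if you are overriding the get_dough() method of the parent class, because inside the override self.get_dough() would simply call the override again. You need super() or an explicit base class call, like here: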
class AbdullahStore(DoughFactory):
    def get_dough(self):
        return 'Abdullah`s special ' + super().get_dough()
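To see what goes wrong with self inside an override, here is a small sketch. SelfCallingStore is a hypothetical broken variant of AbdullahStore, not something from the talk:

```python
class DoughFactory:
    def get_dough(self):
        return 'insecticide treated wheat dough'

class SelfCallingStore(DoughFactory):
    def get_dough(self):
        # self.get_dough resolves to this very method, so it never
        # reaches DoughFactory.get_dough and recurses until Python gives up.
        return 'Abdullah`s special ' + self.get_dough()

try:
    SelfCallingStore().get_dough()
    recursed = False
except RecursionError:
    recursed = True

print('RecursionError raised:', recursed)  # RecursionError raised: True
```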
I think this is a frequent scenario in practice. If we call DoughFactory.get_dough(self) directly, then the behaviour is fixed: a class deriving from AbdullahStore would have to override the complete method and could not reuse the 'added value' of AbdullahStore. On the other hand, if we use super().get_dough(), this has the flavour of a template: in any class derived from AbdullahStore, say
class Kebab(AbdullahStore):
    def order_kebab(self, sauce):
        dough = self.get_dough()
        print('Making kebab with %s and %s sauce' % (dough, sauce))
we can 'instantiate' the get_dough() used in AbdullahStore differently, by intercepting it in the MRO like this:
class OrganicKebab(Kebab, OrganicDoughFactory):
    pass
Here's what it does:
Kebab().order_kebab('spicy')
Making kebab with Abdullah`s special insecticide treated wheat dough and spicy sauce
OrganicKebab().order_kebab('spicy')
Making kebab with Abdullah`s special pure untreated wheat dough and spicy sauce
Since OrganicDoughFactory has a single parent, DoughFactory, it is guaranteed to be placed in the MRO before DoughFactory (in these examples, immediately before it) and thus overrides its methods for all the classes that precede it in the MRO. It took me some time to understand the C3 linearization algorithm used to construct the MRO.
The problem is that the two rules
children come before parents
the order of parents is preserved
from this reference https://rhettinger.wordpress.com/2011/05/26/super-considered-super/ do not yet define the ordering unambiguously. In the class hierarchy
D->C->B->A
 \      /
  --E--
(class A; class B(A); class C(B); class E(A); class D(C,E)) where will E be inserted in the MRO? Is it DCBEA or DCEBA? Perhaps until one can confidently answer questions like this, it is not such a good idea to start inserting super everywhere. I am still not completely sure, but I think C3 linearization, which is unambiguous and chooses the ordering DCBEA in this example, does allow us to do the interception trick the way we did it, unambiguously.
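For what it's worth, the DCBEA answer can be reproduced with a few lines of code. The following is a simplified sketch of the C3 merge step as I understand it, not CPython's actual implementation; c3_merge and c3_linearize are hypothetical helper names, and the result is compared against the interpreter's own D.mro().

```python
def c3_merge(sequences):
    """Repeatedly take the first head that is in no remaining sequence's tail."""
    sequences = [list(seq) for seq in sequences if seq]
    result = []
    while sequences:
        for seq in sequences:
            head = seq[0]
            if not any(head in other[1:] for other in sequences):
                break
        else:
            raise TypeError('inconsistent hierarchy, no valid MRO')
        result.append(head)
        # Remove the chosen class from every sequence, then drop empty ones.
        sequences = [[c for c in seq if c is not head] for seq in sequences]
        sequences = [seq for seq in sequences if seq]
    return result

def c3_linearize(cls):
    """The class itself, then the merge of its parents' linearizations
    and the list of parents (which preserves the parents' order)."""
    bases = list(cls.__bases__)
    return [cls] + c3_merge([c3_linearize(b) for b in bases] + [bases])

class A: pass
class B(A): pass
class C(B): pass
class E(A): pass
class D(C, E): pass

order = ''.join(cls.__name__ for cls in c3_linearize(D)[:-1])  # drop object
print(order)                       # DCBEA
print(c3_linearize(D) == D.mro())  # True
```

A cannot be taken right after B because it still appears in the tail of E's linearization, which is what pushes E in front of A.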
Now, I suppose you can predict the result of
class KebabNPizza(Kebab, OrganicPizza): pass
KebabNPizza().order_kebab('hot')
which is an improved kebab:
Making kebab with Abdullah`s special pure untreated wheat dough and hot sauce
But it has probably taken you some time to calculate.
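If you would rather not calculate it by hand, printing the MRO shows where the interception happens. This sketch repeats the classes from above (with order_pizza shortened) so that it runs on its own:

```python
class DoughFactory:
    def get_dough(self):
        return 'insecticide treated wheat dough'

class OrganicDoughFactory(DoughFactory):
    def get_dough(self):
        return 'pure untreated wheat dough'

class Pizza(DoughFactory):
    def order_pizza(self, *toppings):
        print('Making pie with %s' % super().get_dough())

class OrganicPizza(Pizza, OrganicDoughFactory):
    pass

class AbdullahStore(DoughFactory):
    def get_dough(self):
        return 'Abdullah`s special ' + super().get_dough()

class Kebab(AbdullahStore):
    def order_kebab(self, sauce):
        dough = self.get_dough()
        print('Making kebab with %s and %s sauce' % (dough, sauce))

class KebabNPizza(Kebab, OrganicPizza):
    pass

mro_names = [cls.__name__ for cls in KebabNPizza.mro()]
print(' -> '.join(mro_names))
# KebabNPizza -> Kebab -> AbdullahStore -> OrganicPizza -> Pizza
#   -> OrganicDoughFactory -> DoughFactory -> object  (printed on one line)
KebabNPizza().order_kebab('hot')
```

The super() call in AbdullahStore.get_dough skips OrganicPizza and Pizza, which do not define get_dough, and lands on OrganicDoughFactory.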
When I first looked at the super docs https://docs.python.org/3.5/library/functions.html?highlight=super#super a week ago, coming from a C++ background, it was like "wow, ok, here are the rules, but how can this ever work and not stab you in the back?".
Now I understand more about it, but I am still reluctant to insert super everywhere. I think most of the codebases I have seen do it just because super() is more convenient to type than the base class name.
And that is not even to mention the extreme use of super() in chaining the __init__ functions. What I observe in practice is that everybody writes constructors with the signature that is convenient for the class (and not the universal one) and uses super() to call what they think is their base class constructor.
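For comparison, this is roughly what I understand a 'universal' signature to mean: every __init__ takes its own arguments as keywords and forwards the rest along the MRO. It is only a sketch, and Base, Sauce, Dough and Wrap are hypothetical classes invented for this illustration.

```python
class Base:
    def __init__(self, **kwargs):
        # End of the chain: object.__init__ rejects extra arguments,
        # so a keyword nobody consumed shows up here as a TypeError.
        super().__init__(**kwargs)

class Sauce(Base):
    def __init__(self, *, sauce, **kwargs):
        super().__init__(**kwargs)  # forward whatever is not ours
        self.sauce = sauce

class Dough(Base):
    def __init__(self, *, dough, **kwargs):
        super().__init__(**kwargs)
        self.dough = dough

class Wrap(Sauce, Dough):
    pass

wrap = Wrap(sauce='spicy', dough='pure untreated wheat dough')
print(wrap.sauce, '/', wrap.dough)  # spicy / pure untreated wheat dough
```

Here super().__init__ in Sauce ends up calling Dough.__init__, which is not Sauce's base class at all; that only works because both agree on the keyword-forwarding convention.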