I'm trying to understand how bytecode works.

a.func() is a function call. The corresponding bytecode is roughly LOAD_GLOBAL a, LOAD_ATTR func and then CALL_FUNCTION with 0 arguments.

This is totally fine if a is a module. But if a is an object, the call has to pass the object instance itself. Since Python cannot know at compile time whether a is a module or an object, the bytecode is naturally the same regardless of the type of a. But how does the runtime system handle self as the first argument to func if a is an object? Is there some special handling below the bytecode level that says "if it is called on an object, prepend the object as the first argument"?
The bytecode doesn't have to vary for different object types. It is the responsibility of the object type itself to manage binding behaviour. This is covered in the descriptor protocol.
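As a rough illustration of that point, the sketch below runs one compiled function against both a module and an instance. The names Greeter, plain_func, fake_module and call_func are made up for this example, and the module is built in memory with types.ModuleType rather than imported from disk.

```python
import types


class Greeter:
    """Hypothetical class standing in for 'a is an object'."""

    def func(self):
        # self is supplied by the bound method, not by the caller.
        return ("method", self)


def plain_func():
    """Hypothetical module-level function; it takes no self."""
    return ("function", None)


# Throwaway in-memory module (assumption: stands in for an imported one).
fake_module = types.ModuleType("fake_module")
fake_module.func = plain_func


def call_func(a):
    # Compiled once; this bytecode does not depend on the type of a.
    return a.func()


instance = Greeter()
module_result = call_func(fake_module)
instance_result = call_func(instance)

print(module_result)
print(instance_result[0], instance_result[1] is instance)
```

The same call_func body works in both cases; only the object's own attribute lookup decides whether an instance is passed along.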
In short, LOAD_ATTR delegates attribute access to the object, via the object.__getattribute__ hook.

For modules, __getattribute__ simply looks up the name in the __dict__ namespace and returns it. But for classes and metaclasses, the implementation will invoke the descriptor protocol if the attribute supports this. Functions support the descriptor protocol and return a bound method when so asked.

This binding behaviour also underlies how property, classmethod and staticmethod objects work (the latter neuters the binding behaviour of a function by returning the function itself).

In a nutshell, a.func already knows which object it is bound to, and so does not require an explicit self (it already knows what self is). Contrast this with A.func (where A is the class): calling A.func does require an explicit self.

Or, in bytecodes:
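A minimal sketch of that comparison using the standard dis module. The class A, the instance a and the two wrapper functions are assumptions made for illustration. Exact opcode names differ between CPython versions (some releases show LOAD_METHOD or CALL where others show LOAD_ATTR or CALL_FUNCTION), so only the global loads are compared here.

```python
import dis


class A:
    def func(self):
        return self


a = A()


def call_on_instance():
    return a.func()      # bound method: self is implied


def call_on_class():
    return A.func(a)     # looked up on the class: self passed explicitly


def global_loads(fn):
    """Return the names loaded by LOAD_GLOBAL instructions in fn."""
    return [ins.argval for ins in dis.get_instructions(fn)
            if ins.opname == "LOAD_GLOBAL"]


print(global_loads(call_on_instance))
print(global_loads(call_on_class))

# Uncomment to see the full, version-specific listings:
# dis.dis(call_on_instance)
# dis.dis(call_on_class)
```

The class-based call needs one more global load than the instance-based call, because the instance has to be fetched separately in order to be passed as self.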
(Note the extra LOAD_GLOBAL.)

The mechanics of bound vs unbound methods are explained in the Python Language Reference (search for im_self or __self__).

LOAD_ATTR does the magic via descriptors (https://docs.python.org/2/howto/descriptor.html).
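What that "magic" amounts to can be sketched by calling the descriptor hook by hand. The class A and instance a below are assumptions for illustration, and the code is written for Python 3, where the bound instance is available as __self__.

```python
class A:
    def func(self):
        return self


a = A()

# The raw function as stored on the class, before any binding happens.
raw = A.__dict__["func"]

# Roughly what attribute lookup does for a.func: ask the descriptor to bind.
bound_by_hand = raw.__get__(a, A)

print(type(raw).__name__)
print(type(bound_by_hand).__name__)
print(bound_by_hand.__self__ is a)      # the remembered instance
print(bound_by_hand.__func__ is raw)    # the wrapped function
print(bound_by_hand() is a, a.func() is a)
```

The hand-made bound method behaves like a.func: both carry the instance with them, so neither call needs an explicit argument.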
Assuming a is an object of class A: in Python, functions are descriptors. When you do a.func, the lookup actually finds A.func, which is a descriptor object (an unbound function). It then "upgrades" itself to a bound function (its __get__ method is called). An unbound function must be given the self argument explicitly, as its first argument. A bound function already has the self argument remembered "inside" itself.

In Python a module is an object too and goes through the same attribute lookup mechanism. Its functions are found in the module's own __dict__ rather than on its type, so no binding takes place.
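The difference between the two call styles can be tried out with a short sketch. The class A, its helper static method and the variable names are hypothetical. The code assumes Python 3, where A.func is a plain function rather than the unbound method object of Python 2.

```python
class A:
    def func(self):
        return "func called on " + type(self).__name__

    @staticmethod
    def helper():
        return "no self needed"


a = A()

bound = a.func              # remembers a "inside" itself
via_instance = bound()      # no explicit self
via_class = A.func(a)       # self passed explicitly

try:
    A.func()                # nothing bound, nothing passed
    missing_self_error = None
except TypeError as exc:
    missing_self_error = exc

# staticmethod's __get__ hands back the plain function, so nothing is bound.
static_is_plain = a.helper is A.__dict__["helper"].__func__

print(via_instance)
print(via_class)
print(type(missing_self_error).__name__)
print(static_is_plain)
```

Both call styles give the same result, while calling through the class without an argument fails because nothing supplied self.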