Python bytecode function call passing self

Posted 2019-07-08 03:50

I'm trying to understand how bytecode works.

a.func() is a function call. The corresponding bytecode is roughly LOAD_GLOBAL a, LOAD_ATTR func, and then CALL_FUNCTION with 0 arguments.

This is totally fine if a is a module. But if a is an object, the call has to pass the object instance itself. Since Python cannot know at compile time whether a is a module or an object, the bytecode is naturally the same regardless of the type of a. But how does the runtime system supply self as the first argument to func if a is an object? Is there some special handling below the bytecode level that says "if it is called on an object, prepend the object as the first argument"?
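The situation can be reproduced with a small sketch. The names call_func, mod and A are hypothetical and only stand in for the a.func() call above; the module is built with types.ModuleType so that the snippet is self-contained.

```python
import types


def call_func(a):
    # Compiled once: the bytecode for `a.func()` is identical for every `a`.
    return a.func()


# Hypothetical module-like object holding a function that takes no arguments.
mod = types.ModuleType("mod")
mod.func = lambda: "called with no arguments"


class A:
    def func(self):
        # Return the instance so the caller can see what `self` was.
        return self


obj = A()
module_result = call_func(mod)    # no argument is passed to mod.func
instance_result = call_func(obj)  # obj arrives as `self`

print(module_result)           # called with no arguments
print(instance_result is obj)  # True
```

The same compiled call site works in both cases, so whatever supplies self must happen at run time, during the attribute lookup or the call.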

3 answers
男人必须洒脱
Reply #2 · 2019-07-08 04:17

The bytecode doesn't have to vary for different object types. It is the responsibility of the object type itself to manage binding behaviour. This is covered in the descriptor protocol.

In short, LOAD_ATTR delegates attribute access to the object, via the object.__getattribute__ hook:

Called unconditionally to implement attribute accesses for instances of the class.

For modules, __getattribute__ simply looks up the name in the module's __dict__ namespace and returns it. But for attributes found on a class, the implementation invokes the descriptor protocol if the attribute supports it. Functions support the descriptor protocol and return a bound method when accessed through an instance:

>>> class Foo:
...     def method(self): pass
...
>>> Foo().method  # access on an instance -> binding behaviour
<bound method Foo.method of <__main__.Foo object at 0x107155828>>
>>> Foo.method    # access on the class: the function just returns itself here
<function Foo.method at 0x1073702f0>
>>> Foo.method.__get__(Foo(), Foo)  # manually bind the function
<bound method Foo.method of <__main__.Foo object at 0x107166da0>>

This binding behaviour also underlies how property, classmethod and staticmethod objects work (the latter neuters the binding behaviour of a function by returning the function itself).
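As a rough illustration of that binding step, here is a minimal sketch of a descriptor that behaves like a function. PlainFunction is a hypothetical class written for this example, not how CPython implements functions internally; it only mirrors the `__get__` behaviour described above, using types.MethodType to build the bound method.

```python
import types


class PlainFunction:
    """Hypothetical stand-in for a function object, to show the binding step."""

    def __init__(self, func):
        self.func = func

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)

    def __get__(self, instance, owner=None):
        if instance is None:
            # Accessed on the class: nothing to bind, return the descriptor.
            return self
        # Accessed on an instance: return a bound method that remembers it.
        return types.MethodType(self, instance)


class Foo:
    # Defined at class level, so attribute access triggers __get__.
    method = PlainFunction(lambda self: ("called with", self))


foo = Foo()
bound = foo.method           # __get__(foo, Foo) runs here
result = bound()             # no explicit argument, foo is supplied as self
class_access = Foo.method    # __get__(None, Foo) returns the descriptor itself
```

The call bound() passes no arguments, yet the wrapped function receives foo, because the bound method object prepends its stored instance when called.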

Luminary・发光体
Reply #3 · 2019-07-08 04:33

In a nutshell, a.func already knows which object it is bound to, and so does not require an explicit self (it already knows what self is):

>>> a.func
<bound method A.func of <__main__.A object at 0x10e810a10>>

Contrast this with A.func (where A is the class). The output shown is from Python 2; in Python 3, A.func is simply a plain function:

>>> A.func
<unbound method A.func>

Calling A.func does require an explicit self:

>>> A.func()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unbound method func() must be called with A instance as first argument (got nothing instead)
>>> A.func(a)
>>>

Or, in bytecode, for the call A.func(a):

          0 LOAD_GLOBAL              0 (A)
          3 LOAD_ATTR                1 (func)
          6 LOAD_GLOBAL              2 (a)
          9 CALL_FUNCTION            1
         12 POP_TOP             

(Note the extra LOAD_GLOBAL.)

The mechanics of bound vs unbound methods are explained in the Python Language Reference (search for im_self, the Python 2 name, or __self__).
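For comparison, here is a small sketch of the same idea under Python 3, where there is no separate unbound method type and A.func is a plain function. The class A and its func method are hypothetical stand-ins for the ones used in this answer.

```python
class A:
    def func(self):
        # Return self so both call styles can be compared.
        return self


a = A()
bound = a.func

# The bound method carries the instance and the underlying function.
remembers_instance = bound.__self__ is a
wraps_class_function = bound.__func__ is A.func

# Both spellings end up calling the same function with the same self.
via_instance = a.func()
via_class = A.func(a)

# Calling through the class without an instance fails: self is missing.
try:
    A.func()
    raised = False
except TypeError:
    raised = True
```

In Python 3 the failing call reports a missing positional argument rather than the "unbound method" message shown in the Python 2 traceback.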

不美不萌又怎样
Reply #4 · 2019-07-08 04:38

LOAD_ATTR does the magic via descriptors ( https://docs.python.org/2/howto/descriptor.html ).

Assuming a is an instance of class A: in Python, functions are descriptors. When you evaluate a.func, the lookup actually finds A.func, which is a descriptor object (an unbound function). It then "upgrades" itself to a bound method (A.func.__get__ is called). An unbound function must be given self explicitly as its first argument. A bound method already has self remembered "inside" itself.

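In Python a module is an object too, and attribute lookup on it goes through exactly the same mechanism. The difference is that a module's functions live in the module object's own __dict__ rather than on its type, so no binding takes place.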
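A short sketch of where the function is stored and why that decides whether binding happens. The names mod, A and extra are hypothetical, and the module is created with types.ModuleType only to keep the example self-contained.

```python
import types


def func():
    return "no self"


# Hypothetical module: the function sits in the module object's own __dict__.
mod = types.ModuleType("mod")
mod.func = func


class A:
    def func(self):
        return self


a = A()
a.extra = func  # stored in the instance __dict__, not on the class

# Found in the object's own __dict__: returned as-is, no binding.
module_attr_is_plain = mod.func is func
instance_attr_is_plain = a.extra is func

# Found on the class: the descriptor protocol produces a bound method.
class_attr_is_bound = isinstance(a.func, types.MethodType)
stored_as_function = isinstance(A.__dict__["func"], types.FunctionType)
```

So a function attached directly to an instance behaves like a module function: a.extra() is called with no arguments, while a.func() receives a as self.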
