Question:

When inside a tracing function, debugging a function call, is it possible to somehow retrieve the calling expression?

I can get the calling line number from the traceback object, but if there are several function calls (possibly to the same function) on that line (e.g. as subexpressions in a bigger expression), how can I learn where this call came from? I would be happy even with the offset from the start of the source line.

traceback.tb_lasti seems to give more granular context (index of the last bytecode tried) -- is it somehow possible to connect a bytecode to its exact source range?

EDIT: Just to clarify -- I need to extract the specific (sub)expression (the callsite) from the calling source line.

Answer 1:

Traceback frames have a line number too:

lineno = traceback.tb_lineno

You can also reach the code object, which will have a name, and a filename:

name = traceback.tb_frame.f_code.co_name
filename = traceback.tb_frame.f_code.co_filename

You can use the filename and line number, plus the frame globals and the linecache module to efficiently turn that into the correct source code line:

linecache.checkcache(filename)
line = linecache.getline(filename, lineno, traceback.tb_frame.f_globals)

This is what the traceback module uses to turn a traceback into a useful piece of information, in any case.
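Putting those pieces together, here is a minimal sketch. The names describe_traceback and failing are made up for illustration, and the traceback object is obtained by deliberately raising an exception, which is only one of several ways to get one.

```python
import linecache
import sys


def describe_traceback(tb):
    """Return (function name, filename, line number, source line) for one traceback entry."""
    code = tb.tb_frame.f_code
    filename = code.co_filename
    lineno = tb.tb_lineno
    linecache.checkcache(filename)  # discard cached lines if the file changed on disk
    line = linecache.getline(filename, lineno, tb.tb_frame.f_globals)
    return code.co_name, filename, lineno, line.strip()


def failing():  # hypothetical function, only here to raise an exception
    return 1 / 0


try:
    failing()
except ZeroDivisionError:
    tb = sys.exc_info()[2]
    while tb.tb_next is not None:  # walk down to the innermost frame
        tb = tb.tb_next
    info = describe_traceback(tb)
```

Note that linecache.getline returns an empty string when the source is not available (for example, code run from a string), so the last element of info may be empty.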

Since bytecode only has a line number associated with it, you cannot directly map the bytecode back to the precise part of a source code line; you'd have to parse that line yourself to determine what bytecode each part would emit, then match that with the bytecode of the code object.

You could do that with the ast module, but you can't do that on a line-by-line basis as you'd need scope context to generate the correct bytecodes for local versus cell versus global name look-ups, for example.
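As a side note that postdates this answer: on CPython 3.11 and later, code objects also expose co_positions(), which yields (line, end line, column, end column) for each instruction. The sketch below assumes such an interpreter and returns None on older ones; caller_position and demo are illustrative names, and sys._getframe is a CPython-specific helper.

```python
import sys


def caller_position():
    """Return (line, end_line, col, end_col) of the call executing in the caller.

    Returns None on interpreters that lack code.co_positions().
    """
    frame = sys._getframe(1)  # CPython-specific way to reach the calling frame
    code = frame.f_code
    if not hasattr(code, "co_positions"):
        return None
    # Each code unit is two bytes wide, so f_lasti // 2 indexes the position table.
    positions = list(code.co_positions())
    return positions[frame.f_lasti // 2]


def demo():
    return [caller_position(), caller_position()]  # two calls on one source line


first, second = demo()
```

When positions are available, first and second should carry different column ranges even though both calls sit on the same line. With a traceback object, tb_lasti can be used in place of f_lasti.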

Answer 2:

Unfortunately, compiled bytecode has lost its column offsets; the bytecode index to line number mapping is contained in the co_lnotab line number table. The dis module is a nice way of looking at the bytecode and interpreting co_lnotab:

>>> import dis
>>> dis.dis(compile('a, b, c', '', 'eval'))
  1           0 LOAD_NAME                0 (a)
              3 LOAD_NAME                1 (b)
              6 LOAD_NAME                2 (c)
              9 BUILD_TUPLE              3
             12 RETURN_VALUE        
  ^-- line number

However, there's nothing stopping us from messing with the line number:

>>> import ast
>>> a = ast.parse('a, b, c', mode='eval')
>>> for n in ast.walk(a):
...     if hasattr(n, 'col_offset'):
...         n.lineno = n.lineno * 1000 + n.col_offset
...
>>> dis.dis(compile(a, '', 'eval'))
1000           0 LOAD_NAME                0 (a)

1003           3 LOAD_NAME                1 (b)

1006           6 LOAD_NAME                2 (c)
              9 BUILD_TUPLE              3
             12 RETURN_VALUE

Since compiling code directly should be the same as compiling via ast.parse, and since messing with line numbers shouldn't affect the generated bytecode (other than the co_lnotab), you should be able to:

1. locate the source file
2. parse it with ast.parse
3. munge the line numbers in the ast to include the column offsets
4. compile the ast
5. use the tb_lasti to search the munged co_lnotab
6. convert the munged line number back to (line number, column offset)
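The steps above can be sketched roughly as follows. This is an untested-in-production illustration under a few assumptions: no source line is 1000 or more columns wide, dis.findlinestarts is used instead of decoding co_lnotab by hand, and the end line numbers are munged as well because newer interpreters reject AST nodes that end before they start. The helper names are made up.

```python
import ast
import dis

SCALE = 1000  # assumption: no source line is 1000 or more columns wide


def munge_positions(tree):
    """Fold each node's column offset into its line number, in place."""
    for node in ast.walk(tree):
        if getattr(node, "lineno", None) is None or getattr(node, "col_offset", None) is None:
            continue  # node types that carry no position information
        end_lineno = getattr(node, "end_lineno", None)
        end_col = getattr(node, "end_col_offset", None)
        node.lineno = node.lineno * SCALE + node.col_offset
        # Munge the end line the same way so the node's range stays valid.
        if end_lineno is not None and end_col is not None:
            node.end_lineno = end_lineno * SCALE + end_col
    return tree


def column_starts(code):
    """Yield (bytecode offset, line, column) decoded from the munged line table."""
    for offset, munged in dis.findlinestarts(code):
        if munged is None or munged < SCALE:
            continue  # entries the compiler emitted without a munged position
        line, col = divmod(munged, SCALE)
        yield offset, line, col


def locate(code, lasti):
    """Return (line, column) of the last munged entry at or before offset lasti."""
    found = None
    for offset, line, col in column_starts(code):
        if offset > lasti:
            break
        found = (line, col)
    return found


tree = munge_positions(ast.parse("a, b, c", mode="eval"))
code = compile(tree, "<munged>", "eval")
starts = list(column_starts(code))
```

For a real traceback you would compile the whole source file this way, find the code object that matches the frame's f_code (for instance by name and first line), and call locate with tb_lasti.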

Answer 3:

I know it's necromancy, but I posted a similar question yesterday without seeing this one first. So just in case someone is interested, I solved my problem in a different way than the accepted answer, by using the inspect and ast modules in Python 3. It's still for debugging and educational purposes, but it does the trick.

The answer is rather long so here is the link

Answer 4:

This is how I finally solved the problem: I instrumented each function call in the original program by wrapping it in a call to a helper function, together with information about the source location of the original call. Actually, I was interested in controlling the evaluation of each subexpression in the program, so I wrapped each subexpression.

More precisely: when I had an expression e in the original program, it became

_after(_before(location_info), e)

in the instrumented program. The helpers were defined like this:

def _before(location_info):
    return location_info

def _after(location_info, value):
    return value

When the tracer reported the call to _before, I knew that it was about to evaluate the expression at the location represented by location_info (the tracing system gives me access to local variables/parameters, that's how I got to know the value of location_info). When the tracer reported the call to _after, I knew that the expression indicated by location_info had just been evaluated and that its value was in value.

I could have written the execution "event handling" right into those helper functions and bypassed the tracing system altogether, but I needed it for other reasons as well, so I used those helpers only for triggering a "call" event in the tracing system.
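For illustration, here is a small sketch of that kind of instrumentation. It is not the code used in the actual project: it wraps only call expressions rather than every subexpression, and it records events directly inside the helpers (the alternative mentioned above) instead of going through a tracer. CallWrapper, run_instrumented and events are made-up names.

```python
import ast

events = []  # (kind, location, value) tuples recorded by the helpers


def _before(location_info):
    events.append(("before", location_info, None))
    return location_info


def _after(location_info, value):
    events.append(("after", location_info, value))
    return value


class CallWrapper(ast.NodeTransformer):
    """Rewrite every call e into _after(_before((line, col)), e)."""

    def visit_Call(self, node):
        self.generic_visit(node)  # instrument nested calls first
        location = ast.Constant(value=(node.lineno, node.col_offset))
        before = ast.Call(func=ast.Name(id="_before", ctx=ast.Load()),
                          args=[location], keywords=[])
        wrapped = ast.Call(func=ast.Name(id="_after", ctx=ast.Load()),
                           args=[before, node], keywords=[])
        return ast.copy_location(wrapped, node)


def run_instrumented(source, env):
    """Instrument and evaluate a single expression with the given names."""
    tree = CallWrapper().visit(ast.parse(source, mode="eval"))
    ast.fix_missing_locations(tree)  # give the new nodes line/column info
    code = compile(tree, "<instrumented>", "eval")
    return eval(code, dict(env, _before=_before, _after=_after))


result = run_instrumented("f(1) + f(f(2))", {"f": lambda x: x * 10})
```

After this runs, events lists a "before" and an "after" entry for each of the three calls, keyed by (line, column), so calls to the same function on the same line can be told apart.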

The result can be seen here: http://thonny.org