Detect all global variables within a Python function

Posted 2020-06-04 05:56

Question:

I am trying to analyze some messy code that uses global variables quite heavily within functions (I want to refactor it so that functions only use local variables). Is there any way to detect the global variables used within a function?

For example:

def f(x):
    x = x + 1
    z = x + y
    return z

Here the global variable is y, since it is neither passed as an argument nor created within the function.

I tried to detect global variables within the function using string parsing, but it was getting messy. Is there a better way to do this?

Edit: If anyone is interested, this is the code I am using to detect global variables (based on kindall's answer and on Paolo's answer to the question "Capture stdout from a script in Python"):

from dis import dis

def capture(f):
    """
    Decorator to capture standard output
    """
    def captured(*args, **kwargs):
        import sys
        try:
            from cStringIO import StringIO  # Python 2
        except ImportError:
            from io import StringIO         # Python 3

        backup = sys.stdout                 # remember the real stdout
        buffer = StringIO()
        sys.stdout = buffer                 # capture output

        try:
            f(*args, **kwargs)
            out = buffer.getvalue()         # read what was captured
        finally:
            sys.stdout = backup             # restore original stdout
            buffer.close()                  # close the temporary stream

        return out                          # captured output as a string

    return captured

dis_ = capture(dis)

def return_globals(f):
    """
    Prints the disassembly lines of function f that load a global variable
    """
    for line in dis_(f).splitlines():
        if "LOAD_GLOBAL" in line:
            print(line)

return_globals(f)

dis prints its output instead of returning it, so if you want to manipulate the output of dis as a string, you can wrap it with the capture decorator written by Paolo and posted here: Capture stdout from a script in Python
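
On Python 3.4 and later, dis.dis accepts a file argument, so the listing can be written straight into a string buffer without swapping sys.stdout. The following is a minimal sketch of that alternative, not the code from the question. The helper names disassemble_to_string and global_load_lines are made up for illustration, and f is the example function from the question.

```python
import dis
import io


def disassemble_to_string(func):
    """Return the dis.dis() listing of func as a string."""
    buffer = io.StringIO()
    dis.dis(func, file=buffer)  # file= is available from Python 3.4
    return buffer.getvalue()


def global_load_lines(func):
    """Return the listing lines that contain a LOAD_GLOBAL opcode."""
    listing = disassemble_to_string(func)
    return [line for line in listing.splitlines() if "LOAD_GLOBAL" in line]


def f(x):
    x = x + 1
    z = x + y  # y is not assigned in f, so it is looked up as a global
    return z


for line in global_load_lines(f):
    print(line)
```

The exact layout of each printed line depends on the Python version, so matching on the opcode name is still string parsing and is best treated as a quick inspection aid.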

Answer 1:

Inspect the bytecode.

from dis import dis
dis(f)

Result:

  2           0 LOAD_FAST                0 (x)
              3 LOAD_CONST               1 (1)
              6 BINARY_ADD
              7 STORE_FAST               0 (x)

  3          10 LOAD_FAST                0 (x)
             13 LOAD_GLOBAL              0 (y)
             16 BINARY_ADD
             17 STORE_FAST               1 (z)

  4          20 LOAD_FAST                1 (z)
             23 RETURN_VALUE

Global variables are read with a LOAD_GLOBAL opcode instead of LOAD_FAST. (If the function assigns to any global variables, there will be STORE_GLOBAL opcodes as well.)

With a little work, you could even write a function that scans the bytecode of a function and returns a list of the global variables it uses. In fact:

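from dis import HAVE_ARGUMENT, opmap

def getglobals(func):
    # Written for Python 2 bytecode, where every opcode >= HAVE_ARGUMENT
    # is followed by a two-byte little-endian argument.
    GLOBAL_OPS = opmap["LOAD_GLOBAL"], opmap["STORE_GLOBAL"]
    EXTENDED_ARG = opmap["EXTENDED_ARG"]

    func = getattr(func, "im_func", func)   # unwrap methods
    code = func.func_code
    names = code.co_names

    op = (ord(c) for c in code.co_code)     # iterate over the raw bytes
    globs = set()
    extarg = 0

    for c in op:
        if c in GLOBAL_OPS:
            # the argument is an index into co_names
            globs.add(names[next(op) + next(op) * 256 + extarg])
        elif c == EXTENDED_ARG:
            # supplies the high 16 bits of the next opcode's argument
            extarg = (next(op) + next(op) * 256) * 65536
            continue
        elif c >= HAVE_ARGUMENT:
            # skip the argument of any other opcode
            next(op)
            next(op)

        extarg = 0

    return sorted(globs)

print getglobals(f)               # ['y']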
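
The scanner above reads the raw Python 2 bytecode by hand. On Python 3.4 and later, dis.get_instructions decodes the instructions for you, so the same idea can be sketched more briefly. This is an assumed Python 3 adaptation rather than kindall's original code; get_globals, bump and counter are illustrative names.

```python
import dis


def get_globals(func):
    """Return the sorted names that func reads or writes as globals."""
    func = getattr(func, "__func__", func)  # unwrap bound methods
    global_ops = {"LOAD_GLOBAL", "STORE_GLOBAL"}
    return sorted({
        instr.argval                        # the name the opcode refers to
        for instr in dis.get_instructions(func.__code__)
        if instr.opname in global_ops
    })


counter = 0


def f(x):
    x = x + 1
    z = x + y
    return z


def bump():
    global counter
    counter = counter + 1


print(get_globals(f))     # ['y']
print(get_globals(bump))  # ['counter']
```

Builtins such as len are looked up with the same LOAD_GLOBAL opcode, so they appear in the result as well and may need to be filtered out when refactoring.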


Answer 2:

As mentioned in the LOAD_GLOBAL documentation:

LOAD_GLOBAL(namei)

Loads the global named co_names[namei] onto the stack.

This means you can inspect the code object for your function to find globals:

>>> f.__code__.co_names
('y',)

Note that this isn't sufficient for nested functions (nor is the dis.dis method in @kindall's answer). In that case, you will need to look at constants too:

# Define a function containing a nested function
>>> def foo():
...    def bar():
...        return some_global

# foo itself doesn't contain LOAD_GLOBAL, so foo.__code__.co_names is empty.
>>> import dis
>>> dis.dis(foo)
  2           0 LOAD_CONST               1 (<code object bar at 0x2b70440c84b0, file "<ipython-input-106-77ead3dc3fb7>", line 2>)
              3 MAKE_FUNCTION            0
              6 STORE_FAST               0 (bar)
              9 LOAD_CONST               0 (None)
             12 RETURN_VALUE

# Instead, we need to walk the constants to find nested functions
# (if bar contained a nested function too, we'd need to recurse):
>>> from types import CodeType
>>> for constant in foo.__code__.co_consts:
...     if isinstance(constant, CodeType):
...         print(constant.co_names)
('some_global',)
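
The recursion mentioned in the comment above can be sketched as follows, assuming Python 3.4 or later. It walks co_consts for nested code objects and collects the names used by LOAD_GLOBAL and STORE_GLOBAL at every level. The names iter_code_objects, get_globals_recursive, baz and another_global are made up for this example.

```python
import dis
from types import CodeType


def iter_code_objects(code):
    """Yield code itself and every code object nested in its constants."""
    yield code
    for constant in code.co_consts:
        if isinstance(constant, CodeType):
            yield from iter_code_objects(constant)


def get_globals_recursive(func):
    """Return the sorted global names used by func and its nested functions."""
    names = set()
    for code in iter_code_objects(func.__code__):
        for instr in dis.get_instructions(code):
            if instr.opname in ("LOAD_GLOBAL", "STORE_GLOBAL"):
                names.add(instr.argval)
    return sorted(names)


def foo():
    def bar():
        def baz():
            return another_global
        return some_global
    return bar


print(get_globals_recursive(foo))  # ['another_global', 'some_global']
```

Filtering by opcode rather than reading co_names directly also keeps attribute names out of the result, since co_names stores those too.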