What is an Expr in Python AST?

Posted 2019-02-15 05:04

Question:

I'm working on dynamically generating code in Python.

To aid with this, I wrote a helper method which takes in a string of Python code and dumps out the AST. Here's that method:

# I want print treated as a function, not a statement.
import __future__
pfcf = __future__.print_function.compiler_flag

from ast import dump, PyCF_ONLY_AST

def d(s):
    print(dump(compile(s, '<String>', 'exec', pfcf|PyCF_ONLY_AST)))

When I run this function on a simple Hello World, it spits out the following (formatted for easier reading):

d("print('Hello World!')")

Module(body=[Expr(value=Call(func=Name(id='print',
                                       ctx=Load()),
                             args=[Str(s='Hello World!')],
                             keywords=[],
                             starargs=None,
                             kwargs=None))])

I was able to dynamically generate this code and run it - everything was great.

Then I tried to dynamically generate

print(len('Hello World!'))

Should be pretty easy - just another function call. Here's what my code dynamically generated:

Module(body=[Expr(value=Call(func=Name(id='print',
                                       ctx=Load()),
                             args=[Expr(value=Call(func=Name(id='len',
                                                             ctx=Load()),
                                                   args=[Str(s='Hello World!')],
                                                   keywords=[],
                                                   starargs=None,
                                                   kwargs=None))],
                             keywords=[],
                             starargs=None,
                             kwargs=None))])

Running it didn't work, though. Instead, I got this message:

TypeError: expected some sort of expr, but got <_ast.Expr object at 0x101812c10>

So I ran my helper method previously mentioned to see what it would output:

d("print(len('Hello World!'))")

Module(body=[Expr(value=Call(func=Name(id='print',
                                       ctx=Load()),
                             args=[Call(func=Name(id='len',
                                                  ctx=Load()),
                                        args=[Str(s='Hello World!')],
                                        keywords=[],
                                        starargs=None,
                                        kwargs=None)],
                             keywords=[],
                             starargs=None,
                             kwargs=None))])

The difference between what I generated (which doesn't work) and what compile generated (which works) is that compile put the Call directly in args, whereas I wrapped mine in an Expr.

The problem is, in the very first line, I needed to wrap Call in an Expr. I'm confused - why is it sometimes necessary to wrap a Call in an Expr but not other times? Expr seems like it should be just an abstract base class which Call inherits from, but it's required at the top level right under the Module. Why? Is there something subtle I'm missing? What are the rules for when Call needs to be wrapped in an Expr and when it can be used directly?

Answer 1:

Expr is not the node for an expression per se but rather for an expression statement, that is, a statement consisting of only an expression. This is not totally obvious because the abstract grammar uses three different identifiers (Expr, Expression, and expr), all meaning slightly different things.

The grammar of stmt allows an Expr node as one of its forms, but the grammar of expr doesn't include Expr at all. In other words, the args value you are referring to is supposed to be a list of things-that-are-expressions, not a list of Expr nodes. See the documentation of the abstract grammar, which includes:

stmt = FunctionDef(identifier name, arguments args,
                   stmt* body, expr* decorator_list)
     | ClassDef(identifier name, expr* bases, stmt* body, expr* decorator_list)
     #...
     | Expr(expr value)

In other words, a possible statement is Expr(blah), where blah is something matching the grammar of expr. This is the only use of Expr in the grammar, so this is all an Expr can be; an Expr is a possible statement and nothing else. Elsewhere in the grammar:
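The same split is visible in the node classes themselves. The following is a minimal sketch, assuming Python 3 and the standard ast module (the question's dumps come from Python 2, but the stmt/expr split is the same); the variable names are only illustrative.

```python
import ast

# Parse one expression statement and pull out its pieces.
module = ast.parse("print('Hello World!')")
statement = module.body[0]   # the Expr wrapper
call = statement.value       # the Call it wraps

# Expr is a statement node; Call is an expression node.
expr_is_stmt = isinstance(statement, ast.stmt)
expr_is_expr = isinstance(statement, ast.expr)
call_is_expr = isinstance(call, ast.expr)
call_is_stmt = isinstance(call, ast.stmt)

print(type(statement).__name__, expr_is_stmt, expr_is_expr)  # Expr True False
print(type(call).__name__, call_is_expr, call_is_stmt)       # Call True False
```

So despite its name, Expr derives from stmt and not from expr, which is why it cannot appear anywhere an expr is required.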

expr = BoolOp(boolop op, expr* values)
     | BinOp(expr left, operator op, expr right)
     # other stuff notably excluding Expr(...)
     | Call(expr func, expr* args, keyword* keywords,
            expr? starargs, expr? kwargs)

Since the args argument of Call must match expr*, it must be a list of things matching expr. But an Expr node doesn't match expr; the expr grammar matches an expression, not an expression-statement.
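As a rough illustration of that rule, here is one way to build the tree for print(len('Hello World!')) by hand. This sketch assumes Python 3.8 or later, where string literals are ast.Constant nodes and Call takes only func, args, and keywords (the starargs and kwargs fields in the dumps above belong to older versions). The call helper is a hypothetical convenience, not part of the ast module.

```python
import ast

def call(name, *args):
    """Hypothetical helper: build a Call node invoking the global `name`."""
    return ast.Call(func=ast.Name(id=name, ctx=ast.Load()),
                    args=list(args), keywords=[])

# len('Hello World!') is used as an argument, so it stays a bare Call.
inner = call('len', ast.Constant(value='Hello World!'))

# Only the outermost call stands alone as a statement, so only it gets an Expr.
tree = ast.Module(body=[ast.Expr(value=call('print', inner))], type_ignores=[])

# Hand-built nodes have no line/column info; fill it in before compiling.
ast.fix_missing_locations(tree)

exec(compile(tree, '<generated>', 'exec'))  # prints 12
```

Wrapping inner in ast.Expr before passing it to call would reproduce the error from the question, because args would then contain a statement node.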

Note that if you use the "eval" mode of compile, it will compile an expression, not a statement, so the Expr node will be absent, and the top-level Module node will be replaced by Expression:

>>> print(dump(compile('print("blah")', '<String>', 'eval', pfcf|PyCF_ONLY_AST)))
Expression(body=Call(func=Name(id='print', ctx=Load()), args=[Str(s=u'blah')], keywords=[], starargs=None, kwargs=None))

You can see that the body of an Expression is a single expression (i.e., an expr), so body is not a list but is set directly to the Call node. When you compile in "exec" mode, though, it has to create extra nodes for the module and its statements, and Expr is such a node.
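The two modes can be compared side by side. This is a small sketch assuming Python 3, using ast.parse in place of compile with PyCF_ONLY_AST (the two are equivalent for this purpose); the variable names are illustrative.

```python
import ast

source = "len('Hello World!')"

exec_tree = ast.parse(source, mode='exec')   # Module -> [Expr -> Call]
eval_tree = ast.parse(source, mode='eval')   # Expression -> Call

wrapped = type(exec_tree.body[0]).__name__   # node sitting in the Module body
direct = type(eval_tree.body).__name__       # node sitting in the Expression body

# An Expression tree compiles in 'eval' mode and yields the value directly.
value = eval(compile(eval_tree, '<generated>', 'eval'))

print(wrapped, direct, value)  # Expr Call 12
```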



Answer 2:

Agreeing with what @BreBarn said:

"When an expression, such as a function call, appears as a statement by itself (an expression statement), with its return value not used or stored, it is wrapped in this container."

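Since you're passing the result of the len call to print, that call does not stand alone as a statement, so it is not an Expr in the AST sense. It is still an expression (an expr), and it goes into args as a bare Call.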

See this for more info: https://greentreesnakes.readthedocs.org/en/latest/nodes.html#expressions
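If a code generator tends to wrap every call in Expr, one possible fix is to normalize nodes at the point where they are used. The sketch below assumes Python 3.8 or later; as_statement and as_expression are hypothetical helper names, not functions from the ast module.

```python
import ast

def as_statement(node):
    """Wrap a bare expression in Expr; leave real statements untouched."""
    return ast.Expr(value=node) if isinstance(node, ast.expr) else node

def as_expression(node):
    """Strip an Expr wrapper so the node can be nested inside another expression."""
    return node.value if isinstance(node, ast.Expr) else node

# Simulate a generator that, like the one in the question, hands back an Expr.
wrapped_len = ast.parse("len('Hello World!')").body[0]

print_call = ast.Call(func=ast.Name(id='print', ctx=ast.Load()),
                      args=[as_expression(wrapped_len)],  # unwrap before nesting
                      keywords=[])

# Wrap only at the top level, where the call is used as a statement.
tree = ast.fix_missing_locations(
    ast.Module(body=[as_statement(print_call)], type_ignores=[]))

exec(compile(tree, '<generated>', 'exec'))  # prints 12
```

The design choice here is to decide about the wrapper at the use site (statement position versus argument position) instead of at the point where the Call is created.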