Why can't attribute names be Python keywords?

Posted 2020-01-27 08:20

Question:

There is a restriction on the syntax of attribute access in Python (at least in the CPython 2.7.2 implementation):

>>> class C(object): pass
>>> o = C()
>>> o.x = 123  # Works
>>> o.if = 123
    o.if = 123
       ^
SyntaxError: invalid syntax

My question is twofold:

  1. Is there a fundamental reason why using Python keyword attribute names (as in o.if = 123) is forbidden?
  2. Is the above restriction on attribute names documented, and if so, where?

It would make sense to do o.class = … in one of my programs, and I am a little disappointed not to be able to (o.class_ would work, but it does not look as simple).

PS: The problem is obviously that if and class are Python keywords. The question is why using keywords as attribute names would be forbidden (I don't see any ambiguity in the expression o.class = 123), and whether this is documented.

Answer 1:

Because the parser is simpler when keywords are always keywords rather than contextual (for example, if being a keyword at statement level but just an identifier inside an expression). For if this would be doubly hard because of X if C else Y, and for is also used in list comprehensions and generator expressions.

So the code never even reaches the point of attribute access; it is simply rejected by the parser, just like incorrect indentation (which is why it is a SyntaxError and not an AttributeError or something similar). The parser does not differentiate whether you use if as an attribute name, a variable name, a function name, or a type name. It can never be an identifier, simply because the parser always labels it as a keyword and treats it as a different token from identifiers.

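It's the same in most languages, and the language grammar (plus the lexer specification) is the documentation for that. The language spec mentions it explicitly: the "Keywords" section of the lexical analysis chapter in the Python Language Reference states that keywords cannot be used as ordinary identifiers. The rule is also unchanged in Python 3, although the set of keywords differs slightly.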
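To make the "rejected by the parser" point concrete, here is a minimal sketch (written for Python 3) that only compiles a few source strings and never runs them. The helper name try_compile and the sample strings are illustrative assumptions, not part of the original answer.

```python
def try_compile(source):
    """Return 'ok' if source compiles, else the name of the exception raised."""
    try:
        # compile() only parses and compiles; nothing in `source` is executed.
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        return type(exc).__name__
    return "ok"


samples = [
    "o.x = 123",        # ordinary attribute name
    "o.if = 123",       # keyword used as an attribute name
    "if = 123",         # keyword used as a variable name
    "def if(): pass",   # keyword used as a function name
]

results = {source: try_compile(source) for source in samples}
for source, outcome in results.items():
    print(source, "->", outcome)
```

Note that o is never defined here. The keyword cases fail before any object or attribute lookup could take place, which is why the error is a SyntaxError in every position.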

Also, just because you can use setattr or __dict__ to create an attribute with a reserved name doesn't mean you should. Don't force yourself or your API's users to use getattr instead of natural attribute access. getattr should be reserved for cases where the attribute name is only known at run time.
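A short sketch of the trade-off described above, assuming the same empty class C as in the question. It shows that a keyword-named attribute can be created dynamically but is then only reachable dynamically, next to the trailing-underscore spelling that PEP 8 recommends for such clashes.

```python
class C(object):
    pass


o = C()

# Possible, but awkward: the attribute can never be written as o.class.
setattr(o, "class", 123)
via_getattr = getattr(o, "class")
via_dict = o.__dict__["class"]

# Conventional alternative: a single trailing underscore keeps dot access.
o.class_ = 123
via_dot = o.class_

print(via_getattr, via_dict, via_dot)  # prints: 123 123 123
```

Every reader of the first variant has to go through getattr or __dict__, whereas the second stays usable with ordinary attribute syntax.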



Answer 2:

Because if is a keyword. You have similar issues with o.while and o.for:

pax> python
>>> class C(object): pass
... 

>>> o = C()

>>> o.not_a_keyword = 123

>>> o.if = 123
  File "<stdin>", line 1
    o.if = 123
       ^
SyntaxError: invalid syntax

>>> o.while = 123
  File "<stdin>", line 1
    o.while = 123
          ^
SyntaxError: invalid syntax

>>> o.for = 123
  File "<stdin>", line 1
    o.for = 123
        ^
SyntaxError: invalid syntax

The full list of keywords can be obtained with the keyword module (the output below is from Python 2):

>>> import keyword
>>> keyword.kwlist
['and', 'as', 'assert', 'break', 'class', 'continue', 'def',
 'del', 'elif', 'else', 'except', 'exec', 'finally', 'for',
 'from', 'global', 'if', 'import', 'in', 'is', 'lambda',
 'not', 'or', 'pass', 'print', 'raise', 'return', 'try',
 'while', 'with', 'yield']

You cannot use any of these keywords as a variable name in Python, and the same applies to attribute names written with dot syntax.
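If names come from outside your code (for example, column or field names), the keyword module can also be used to check them up front. The following is only a sketch: safe_attr_name is a hypothetical helper, and appending an underscore is one possible convention rather than anything the language requires.

```python
import keyword


def safe_attr_name(name):
    """Hypothetical helper: append '_' when name is a reserved keyword."""
    return name + "_" if keyword.iskeyword(name) else name


names = ["x", "if", "class", "not_a_keyword"]
safe_names = [safe_attr_name(n) for n in names]
print(safe_names)  # prints: ['x', 'if_', 'class_', 'not_a_keyword']
```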

I would suggest choosing a more descriptive name, such as iface if it's an interface, or infld for an input field and so forth.

As for the edit to your question asking why keywords aren't allowed: parsers become much simpler if lexical elements are context-free. Having to treat the lexical token if as a keyword in some places and as an identifier in others would introduce complexity that isn't really needed if you choose your identifiers more wisely.

For example, the C++ statement:

long int int = char[new - int];

could, with some difficulty, be parsed by a more complex parser that looks at where those lexical elements occur and what surrounds them. But, at least partly in the interests of simplicity and readability, this is not done.
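The idea of context-free lexical elements can be illustrated with a toy word classifier. This is a deliberately simplified sketch and not how CPython's tokenizer is implemented; the names WORD and classify_words are made up for the example. Each word is labelled by looking at the word alone, never at its neighbours.

```python
import keyword
import re

# Matches identifier-like words; good enough for this toy example.
WORD = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")


def classify_words(source):
    """Label every word as KEYWORD or NAME without looking at its context."""
    return [
        ("KEYWORD" if keyword.iskeyword(word) else "NAME", word)
        for word in WORD.findall(source)
    ]


labels = classify_words("o.if = x if c else y")
for kind, word in labels:
    print(kind, word)
```

Both occurrences of if receive the same label, even though the first one sits where an attribute name would go. A classifier that labelled them differently would need to know about the surrounding grammar, which is exactly the extra complexity the answer refers to.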