
Why doesn't Python auto escape '\\' in

Posted 2020-08-14 10:38

Question:

It seems that some escape sequences still matter inside a docstring. For example, running python foo.py (Python 2.7.10) on the file below raises an error like ValueError: invalid \x escape.

def f():
    """
    do not deal with '\x0'
    """
    pass

In effect, it seems the correct docstring should be:

    """
    do not deal with '\\x0'
    """

Additionally, it also affects import: importing the module fails in the same way.

For Python 3.4.3+, the error message is:

  File "foo.py", line 4
    """
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 24-25: truncated \xXX escape

I find this a bit strange, since I thought it would only affect __doc__ and have no side effect on the module itself.

Why is it designed this way? Is it a flaw or bug in Python?

NOTE

I know the meaning of """ and raw literals; however, I think the Python interpreter should be able to treat docstrings specially, at least in theory.

Answer 1:

From PEP 257:

For consistency, always use """triple double quotes""" around docstrings. Use r"""raw triple double quotes""" if you use any backslashes in your docstrings. For Unicode docstrings, use u"""Unicode triple-quoted strings""".

There are two forms of docstrings: one-liners and multi-line docstrings.
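Applied to the function from the question, the PEP 257 advice could look like the sketch below. It assumes the only change needed is the r prefix on the docstring; the function name f and the docstring text are taken from the question.

```python
def f():
    r"""
    do not deal with '\x0'
    """
    pass


# With the r prefix the backslash is kept literally, so the module
# compiles and the docstring contains the two characters '\' and 'x'.
doc = f.__doc__
print(doc)
```

The same file without the r prefix does not compile, which is why both running and importing it fail.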


Also from here:

There's no such python type as "raw string" -- there are raw string literals, which are just one syntax approach (out of many) to specify constants (i.e., literals) that are of string types.

So "getting" something "as a raw string" just makes no sense. You can write docstrings as raw string literals (i.e., with the prefix r -- that's exactly what denotes a raw string literal, the specific syntax that identifies such a constant to the python compiler), or else double up any backslashes in them (an alternative way to specify constant strings including backslash characters), but that has nothing to do with "getting" them one way or another.
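A small sketch of that point follows. The names raw_doc, escaped_doc, bad_source and error are illustrative and not from the original posts. It shows that both spellings produce the same ordinary string, and that the unescaped form is rejected when the source is compiled, before any __doc__ attribute exists.

```python
# Two spellings of the same constant: a raw literal and doubled backslashes.
raw_doc = r"do not deal with '\x0'"
escaped_doc = "do not deal with '\\x0'"
same_value = raw_doc == escaped_doc   # both are plain str objects

# Source text of the module from the question. The outer literal is raw
# only so that this example can hold the broken source as data.
bad_source = r'''
def f():
    """
    do not deal with '\x0'
    """
    pass
'''

# Escape sequences are decoded while the source is compiled, so the
# failure happens for the whole module, not just for __doc__.
try:
    compile(bad_source, "foo.py", "exec")
except (SyntaxError, ValueError) as exc:  # SyntaxError on 3.x, ValueError on 2.7
    error = exc
else:
    error = None

print(same_value, type(error).__name__)
```

Because the error is raised at compile time, it surfaces whether the file is run directly or imported.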