Python Literal r'\' Not Accepted

Posted 2020-04-02 00:04

r'\' in Python does not work as expected. Instead of returning a string with one character (a backslash) in it, it raises a SyntaxError. r"\" does the same.

This is rather cumbersome if you have a list of Windows paths like these:

paths = [ r'\bla\foo\bar',
          r'\bla\foo\bloh',
          r'\buff',
          r'\',
          # ...
        ]

Is there a good reason why this literal is not accepted?

5 answers
家丑人穷心不美
#2 · 2020-04-02 00:17

To address your root problem, you can use / in paths on Windows in Python just fine.

The r'' and r"" (raw string) syntax is primarily meant for regular expressions. It does not buy you much for paths, especially when the string has to end with a \.

If you insist on using \, write '\\' or "\\": the backslash is the escape character, so it has to be escaped itself. That isn't pretty; using / or os.path.sep is the better solution.
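As a minimal sketch of the forward-slash approach: the standard library's pathlib.PureWindowsPath accepts / and renders backslashes, and it does so on any platform. The path values are the ones from the question; the variable names raw_paths and paths are only illustrative.

```python
from pathlib import PureWindowsPath

# Paths from the question, written with forward slashes.
raw_paths = ["/bla/foo/bar", "/bla/foo/bloh", "/buff", "/"]

# PureWindowsPath applies Windows path rules regardless of the
# platform the script runs on, so str() yields backslash separators.
paths = [str(PureWindowsPath(p)) for p in raw_paths]

for p in paths:
    print(p)
```

No literal in this snippet has to end in a backslash, so the raw-string limitation never comes up.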

男人必须洒脱
#3 · 2020-04-02 00:24

The answer to my question ("Why is a backslash not allowed as last character in raw strings?") actually to me seems to be "That's a design decision", furthermore a questionable one.

Some answers tried to reason that the lexer and some syntax highlighters are simpler this way. I don't agree (and I have some background in writing parsers and compilers as well as in IDE development). It would be simpler to define raw strings with the semantics that a backslash has no special meaning whatsoever. Both lexer and IDE would benefit from this simplification.

The current behavior is also a wart: it does not even let me put a bare quote into a raw string, because the backslash stays in the result. It only helps if I happen to want a backslash followed by a quote inside my raw string.

I would propose to change this, but I also see the problem of breaking existing code :-/

Juvenile、少年°
#4 · 2020-04-02 00:31

The backslash can be used to make a following quote not terminate the string:

>>> r'\''
"\\'"

So r'foo\' or r'\' are unterminated literals.
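To check this without crashing the script, one option is to hand the source text to the built-in compile() and catch the SyntaxError. The helper name is_valid_literal is made up for this sketch, and the backslash is built with chr(92) so that the example itself needs no escaping.

```python
def is_valid_literal(source):
    """Return True if `source` compiles as a Python expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True

BS = chr(92)  # a single backslash, spelled without any escaping

lone_backslash = "r'" + BS + "'"         # the source text r'\'
escaped_quote = "r'" + BS + "''"         # the source text r'\''
double_backslash = "r'" + BS + BS + "'"  # the source text r'\\'

for source in (lone_backslash, escaped_quote, double_backslash):
    print(source, is_valid_literal(source))
```

The first literal is rejected as unterminated; the other two are accepted, and both evaluate to two-character strings.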

Rationale

Because you specifically asked for the reasoning behind this design decision, relevant aspects could be the following (although this is all based on speculation, of course):

  • Simplifies lexing for the Python interpreter itself (all string literals have the same semantics: a closing quote that is not preceded by an odd number of backslashes terminates the string)
  • Simplifies lexing for syntax highlighting engines (this is a strong argument because most programming languages don't have raw strings that are still enclosed in single or double quotes and lots of syntax highlighting engines are badly broken because they use inappropriate tools like regular expressions to do the lexing)
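The shared rule can be sketched in a few lines. This is only an illustration of the idea, not CPython's actual tokenizer: the function name find_closing_quote is hypothetical, and prefixes and triple-quoted strings are ignored.

```python
def find_closing_quote(source, start):
    """Return the index of the quote closing the literal opened at
    `start`, or None if the literal is unterminated."""
    quote = source[start]
    i = start + 1
    while i < len(source):
        if source[i] == "\\":
            i += 2  # skip the backslash and whatever character follows it
            continue
        if source[i] == quote:
            return i
        i += 1
    return None

BS = chr(92)  # a single backslash

print(find_closing_quote("'abc'", 0))
print(find_closing_quote("'" + BS + "''", 0))  # body of r'\''
print(find_closing_quote("'" + BS + "'", 0))   # body of r'\'
```

The scanner never needs to know whether the literal is raw. That only matters afterwards, when deciding what the backslash sequences mean.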

So yes, there are probably important reasons why this way was chosen, even if you don't agree with these because you think that your specific use case is more important. It is however not, for the following reasons:

  • You can just use normal string literals and escape the backslashes, or read the strings from a file
  • Backslashes in string literals are typically needed in one of these two cases:
    • you provide the string as input to another language interpreter which uses backslashes as a quoting character, like regular expressions. In this case you won't ever need a backslash at the end of a string
    • you are using \ as a path separator, which is usually not necessary because Python supports / as a path separator on Windows and because there's os.path.sep.

Solutions

You can use '\\' or "\\" instead:

>>> print("\\")
\

Or, if you're completely crazy, you can use a raw string literal and combine it with a normal literal just for the trailing backslash, or even use string slicing:

>>> r'C:\some\long\freakin\file\path''\\'
'C:\\some\\long\\freakin\\file\\path\\'
>>> r'C:\some\long\freakin\file\path\ '[:-1]
'C:\\some\\long\\freakin\\file\\path\\'

Or, in your particular case, you could just do:

paths = [ x.replace('/', '\\') for x in '''

  /bla/foo/bar
  /bla/foo/bloh
  /buff
  /

'''.strip().split()]

As a bonus, this saves some typing when adding more paths. Note that split() splits on any whitespace, so it only works for paths that contain no spaces.

beautiful°
#5 · 2020-04-02 00:33

This is in accordance with the documentation:

When an 'r' or 'R' prefix is present, a character following a backslash is included in the string without change, and all backslashes are left in the string. For example, the string literal r"\n" consists of two characters: a backslash and a lowercase 'n'. String quotes can be escaped with a backslash, but the backslash remains in the string; for example, r"\"" is a valid string literal consisting of two characters: a backslash and a double quote; r"\" is not a valid string literal (even a raw string cannot end in an odd number of backslashes). Specifically, a raw string cannot end in a single backslash (since the backslash would escape the following quote character). Note also that a single backslash followed by a newline is interpreted as those two characters as part of the string, not as a line continuation.

Use "\\" instead or, even better, use / as the path separator (yes, this works on Windows).
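The cases from the quoted documentation can be checked directly. This is a small sketch; the variable names and the C:\some\dir path are made up for illustration.

```python
backslash_n = r"\n"      # a backslash and the letter n, not a newline
backslash_quote = r"\""  # a backslash and a double quote
multi_line = r"""a\
b"""                     # the backslash and the newline are both kept

print(len(backslash_n), len(backslash_quote), len(multi_line))

# A trailing backslash has to come from a normal (non-raw) literal.
trailing = r"C:\some\dir" + "\\"
print(trailing)
```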

Melony?
#6 · 2020-04-02 00:36

That is because in raw strings, you need a way to escape single quotes when the string is delimited by single quotes. Same with double quotes.

http://docs.python.org/reference/lexical_analysis.html#string-literals
