Regular expression replace except first and last c

Posted 2020-04-11 10:59

Question:

What is a regular expression that replaces each double quote (") in a string with a backslash followed by a double quote (\"), except at the first and last characters of the string?

Example 1: Double quote embedded in a string

Input: "This is a "Test""
Expected Output: "This is a \"Test\""

Example 2: No double quotes in the middle of the string

Input: "This is a Test"
Expected Output: "This is a Test"

When I perform a re.sub() operation in Python, every double quote is replaced, including the ones at the first and last characters. For Example 2 above, the output string becomes: \"This is a Test\".
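
The behavior described above can be reproduced with a pattern that matches every double quote. This is only a minimal sketch: the question does not show the pattern that was actually used, so the bare `"` pattern here is an assumption.

```python
import re

s = '"This is a Test"'

# Assumed pattern: a bare double quote, which matches at every position,
# including the first and last characters of the string.
escaped = re.sub(r'"', r'\\"', s)

print(escaped)  # prints \"This is a Test\"
```

Nothing in that pattern ties a match to its position in the string, which is why the outer quotes are escaped along with the inner ones.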

Answer 1:

As pointed out by @mgilson, you can just slice off the first and last characters, so this regex is basically unnecessary. Still, a negative lookbehind and a negative lookahead do the job:

>>> import re
>>> print(re.sub(r'(?<!^)"(?!$)', '\\"', '"This is a "Test""'))
"This is a \"Test\""
>>> print(re.sub(r'(?<!^)"(?!$)', '\\"', '"This is a Test"'))
"This is a Test"

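For reuse, the lookaround pattern from this answer could be compiled once and wrapped in a helper. The names `INNER_QUOTE` and `escape_inner_quotes` are hypothetical, chosen for illustration only; this is a sketch, not part of the original answer.

```python
import re

# (?<!^)  negative lookbehind: the quote is not at the start of the string
# (?!$)   negative lookahead: the quote is not at the end of the string
INNER_QUOTE = re.compile(r'(?<!^)"(?!$)')


def escape_inner_quotes(s):
    """Escape every double quote except one at the first or last position."""
    # A callable replacement is returned literally, so no backslash
    # processing is applied to it.
    return INNER_QUOTE.sub(lambda m: '\\"', s)


print(escape_inner_quotes('"This is a "Test""'))
print(escape_inner_quotes('"This is a Test"'))
```

Unlike slicing, this version makes no assumption that the string is wrapped in quotes: a quote is skipped only when it actually sits at the first or last position.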

Answer 2:

I don't know about you, but I'd do it the easy way:

'"{}"'.format(s[1:-1].replace('"',r'\"'))

Of course, this makes a whole bunch of assumptions, the strongest being that the first and last characters are always double quotes ...

Maybe this is a little better:

'{0}{1}{2}'.format(s[0],s[1:-1].replace('"',r'\"'),s[-1])

which preserves the first and last characters and escapes all double quotes in the middle.
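
One edge case the slicing approach leaves open is a string shorter than two characters: for a one-character string, `s[0]` and `s[-1]` are the same character, so it would be duplicated, and an empty string raises `IndexError`. A minimal sketch with a guard follows; the helper name `escape_middle` is hypothetical.

```python
def escape_middle(s):
    """Escape double quotes everywhere except the first and last characters."""
    # With fewer than two characters there is no "middle" to escape, and
    # s[0] + ... + s[-1] would duplicate the character or raise IndexError.
    if len(s) < 2:
        return s
    return s[0] + s[1:-1].replace('"', r'\"') + s[-1]


print(escape_middle('"This is a "Test""'))
print(escape_middle('"This is a Test"'))
```

Plain concatenation is used here instead of `str.format`; the two are equivalent for this purpose.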



Answer 3:

This is awkward to do with a single regex unless you use lookaround assertions. Without them, you can fake it with three regexes.

>>> import re
>>> x = '"This is "what" it is"'
>>> print(x)
"This is "what" it is"
>>> x = re.sub(r'"',r'\\"',x)
>>> print(x)
\"This is \"what\" it is\"
>>> x = re.sub(r'^\\"','"',x)
>>> print(x)
"This is \"what\" it is\"
>>> x = re.sub(r'\\"$','"',x)
>>> print(x)
"This is \"what\" it is"

The first regex changes all quotes into escaped quotes.

The second regex changes the leading quote back (no effect if no leading quote present).

The third regex changes the trailing quote back (no effect if no trailing quote present).
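
The three passes described above can be bundled into one function. This is a sketch of the same steps under a hypothetical name, `escape_three_pass`, not a different technique.

```python
import re


def escape_three_pass(s):
    """Escape inner double quotes using three plain substitutions."""
    # Pass 1: escape every double quote, including the outer ones.
    s = re.sub(r'"', r'\\"', s)
    # Pass 2: restore a leading quote (no effect if there was none).
    s = re.sub(r'^\\"', '"', s)
    # Pass 3: restore a trailing quote (no effect if there was none).
    s = re.sub(r'\\"$', '"', s)
    return s


print(escape_three_pass('"This is "what" it is"'))
```

Because passes 2 and 3 only fire when the string starts or ends with an escaped quote, a string with no surrounding quotes simply has all of its quotes escaped.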