regex difference between vscode and visual studio

Posted 2020-04-12 09:33

Question:

Starting with:

line1
line2

Find: ^(.+)$
Replace: "$1",

In VS Code it works as expected, resulting in:

"line1",
"line2",

In Visual Studio it doesn't seem to work, resulting in:

"line1
",
"line2
",

Which one is correct? I assume VS Code.

Answer 1:

TL;DR: Use ^(.*[^\r\n]) to match a whole line without EOL characters.

According to the Visual Studio docs:

| Purpose                                          | Expression | Example                                                                       |
|--------------------------------------------------|------------|-------------------------------------------------------------------------------|
| Match any single character (except a line break) | .          | a.o matches "aro" in "around" and "abo" in "about" but not "acro" in "across" |
| Anchor the match string to the end of a line     | \r?$       | car\r?$ matches "car" only when it appears at the end of a line               |
| Anchor the match string to the end of the file   | $          | car$ matches "car" only when it appears at the end of the file                |

However, some of this doesn't seem to hold in practice: . does match a line break character, and .$ does match at the end of any line, not only at the end of the file. All of the following patterns match from the beginning of the line to its end, including EOL characters: ^.+, ^.+$, ^.+\r?$.

I have noticed this behavior in VS2017 before. I'm not sure why it happens, but I was able to work around it with something like the following:

^(.*[^\r\n])

Note: You can also get rid of the capturing group and replace with "$0",.
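
To see why the trailing [^\r\n] helps, here is a minimal sketch using Python's re module as a stand-in. This is an assumption for illustration only: Python is not the engine Visual Studio uses, but its dot also excludes only \n, so it reproduces the stray carriage return on CRLF input. The variable names are made up for this example.

```python
import re

# CRLF-terminated sample text, as typically saved on Windows.
text = "line1\r\nline2\r\n"

# The dot excludes only "\n" here, so the greedy ".+" swallows the "\r"
# and the closing quote lands after the carriage return.
naive = re.sub(r"^(.+)$", r'"\1",', text, flags=re.MULTILINE)

# The workaround: the last matched character must be neither CR nor LF,
# so the engine backtracks off the "\r" before closing the group.
fixed = re.sub(r"^(.*[^\r\n])", r'"\1",', text, flags=re.MULTILINE)

print(repr(naive))
print(repr(fixed))
```

In the first result the \r ends up inside the quotes, which is what shows up as a broken line in the editor; in the second the CRLF pair stays intact after the comma.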



Answer 2:

In VS Code regex patterns, a dot . matches any character except line break characters.

In the .NET regex engine used by Visual Studio, a dot matches any character except a newline (LF) character, so it does match a carriage return (CR).

This difference explains the results you get. Neither result is right or wrong; these are just regex engine differences.

Note that you would not have noticed any difference between the two engines if you had used LF-only line endings, but Visual Studio on Windows uses CRLF line endings by default.
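
The dependence on line endings can be sketched as follows. Python's re is used here only as an assumed stand-in for a dot that excludes nothing but LF; it is not the .NET engine itself, and quote_lines is a hypothetical helper name.

```python
import re

# Same pattern as in the question; MULTILINE makes ^ and $ work per line.
PATTERN = re.compile(r"^(.+)$", re.MULTILINE)

def quote_lines(text: str) -> str:
    """Wrap every non-empty line in double quotes followed by a comma."""
    return PATTERN.sub(r'"\1",', text)

# LF-only input: the dot never meets a "\r", so the output is clean.
lf_result = quote_lines("line1\nline2")

# CRLF input: the dot consumes the "\r", which ends up inside the quotes.
crlf_result = quote_lines("line1\r\nline2")

print(repr(lf_result))
print(repr(crlf_result))
```

Only the CRLF case shows the misplaced closing quote, which matches the observation that the two engines agree on LF-only files.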

To wrap a whole line in double quotes using .NET regex, exclude both the LF and CR (carriage return) characters from the match by replacing the dot with the negated character class [^\r\n]:

^[^\r\n]+

Then replace with "$&", where $& refers to the whole match.

You may get rid of the capturing group in the VSCode regex and use the same replacement pattern as in .NET, too.
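
A rough sketch of the negated character class in action, again with Python's re as an assumed stand-in rather than the .NET engine. Python has no $& token; \g<0> is its way of referring to the whole match. The sample text includes a blank line to show that + leaves empty lines alone.

```python
import re

# CRLF text with an empty line in the middle.
text = "line1\r\n\r\nline2"

# [^\r\n]+ stops before any CR or LF, so line endings are never captured.
# \g<0> inserts the whole match, playing the role of $& in .NET.
quoted = re.sub(r"^[^\r\n]+", r'"\g<0>",', text, flags=re.MULTILINE)

print(repr(quoted))
```

Because the class requires at least one character that is neither CR nor LF, blank lines produce no match and are not turned into empty quoted entries.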