How can I match double-quoted strings with escaped

Posted 2019-01-08 15:49

Question:

I need a Perl regular expression to match a string. I'm assuming only double-quoted strings, that a \" is a literal quote character and NOT the end of the string, and that a \\ is a literal backslash character and should not escape a quote character. If it's not clear, some examples:

"\""    # string is 1 character long, contains double quote
"\\"    # string is 1 character long, contains backslash
"\\\""  # string is 2 characters long, contains backslash and double quote
"\\\\"  # string is 2 characters long, contains two backslashes

I need a regular expression that can recognize all 4 of these possibilities, and all other simple variations on those possibilities, as valid strings. What I have now is:

/".*[^\\]"/

But that's not right - it won't match any of those except the first one. Can anyone give me a push in the right direction on how to handle this?

Answer 1:

How about this?

/"([^\\"]|\\\\|\\")*"/

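This matches an opening quote, then zero or more of the following: a character that is neither a backslash nor a quote, an escaped backslash (\\), or an escaped quote (\"), and finally the closing quote.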
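A quick way to check this pattern against the four strings from the question is sketched below. The \A and \z anchors and the @samples list are additions for the test, not part of the answer; a single-quoted heredoc is used so that every backslash in the samples stays literal.

```perl
use strict;
use warnings;

# The four sample strings from the question, one per line.
# A single-quoted heredoc performs no backslash processing.
my @samples = split /\n/, <<'END';
"\""
"\\"
"\\\""
"\\\\"
END

# Pattern from the answer, anchored (an addition) so that the whole
# line has to be exactly one string literal.
my $re = qr/\A"([^\\"]|\\\\|\\")*"\z/;

for my $s (@samples) {
    printf "%-8s %s\n", $s, ($s =~ $re ? 'matches' : 'does not match');
}
```

All four lines should be reported as matches. A string such as "\n" would not be, since this pattern accepts only \\ and \" as escapes.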



Answer 2:

/"(?:[^\\"]|\\.)*"/

This is almost the same as Cal's answer, but has the advantage of matching strings containing escape codes such as \n.

The ?: makes the group non-capturing, so its contents are not saved in a capture variable such as $1. It can be removed without changing what the pattern matches.
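As a rough illustration of the difference, the sketch below runs the non-capturing pattern on a line containing \n and \" escapes, then uses a variant with one capturing group around the whole repetition to pull out the body. The sample text and the variable names are made up for the example.

```perl
use strict;
use warnings;

# Sample source line containing a string literal with \n and \" escapes.
my $text = <<'END';
print "line one\nsaid \"hi\"" . $rest;
END

# Non-capturing form from the answer: it matches, but sets no $1 itself.
my $plain = qr/"(?:[^\\"]|\\.)*"/;

# Variant (an addition): capture the body between the quotes.
my $capturing = qr/"((?:[^\\"]|\\.)*)"/;

my ($literal) = $text =~ /($plain)/;   # whole literal, quotes included
my ($body)    = $text =~ $capturing;   # text between the quotes

print "literal: $literal\n";
print "body:    $body\n";
```

Note that the escapes are matched but not decoded: the body still contains the two characters \ and n rather than a newline.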



Answer 3:

A generic solution (matching all backslashed characters):

/ \A "               # Start of string and opening quote
  (?:                #  Start group
    [^\\"]           #   Anything but a backslash or a quote
    |                #  or
    \\.              #   Backslash and anything
  )*                 # End of group
  " \z               # Closing quote and end of string
  /xms
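Because of the \A and \z anchors, this form validates that an entire value is one string literal rather than searching for a literal inside longer text. A minimal sketch of how it might be used follows; is_quoted_string is a hypothetical helper name and the inputs are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: true only if the whole argument is one string literal.
sub is_quoted_string {
    my ($candidate) = @_;
    return $candidate =~ / \A " (?: [^\\"] | \\. )* " \z /xms ? 1 : 0;
}

# A single-quoted heredoc keeps the backslashes literal.
my @inputs = split /\n/, <<'END';
"plain"
"with \" escape"
"ends with backslash\\"
"unterminated\"
"two" "strings"
END

print "$_ => ", (is_quoted_string($_) ? "valid" : "invalid"), "\n" for @inputs;
```

The first three inputs should be reported as valid. The fourth is rejected because its final quote is escaped, and the fifth because an unescaped quote appears before the end.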


Answer 4:

See Text::Balanced; it's better than reinventing the wheel. Use gen_delimited_pat to see the resulting pattern and learn from it.
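A small sketch of that suggestion: ask gen_delimited_pat for a double-quote pattern, print it for inspection, and try it on a sample line. The exact pattern text may differ between Text::Balanced versions, so it is printed rather than assumed; the sample text is invented.

```perl
use strict;
use warnings;
use Text::Balanced qw(gen_delimited_pat);

# gen_delimited_pat returns the pattern as a string. With one argument,
# backslash is used as the escape character.
my $pat = gen_delimited_pat(q{"});
print "generated pattern: $pat\n";

my $text = <<'END';
say "a \"quoted\" word" and more
END

if ($text =~ /($pat)/) {
    print "found: $1\n";
}
```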



Answer 5:

Regexp::Common is another useful tool to be aware of. It contains regexps for many common cases, including quoted strings:

use Regexp::Common;

my $str = '" this is a \" quoted string"';
if ($str =~ $RE{quoted}) {
  # do something
}


Answer 6:

Here's a very simple way:

/"(?:\\?.)*?"/

Just remember if you're embedding such a regex in a string to double the backslashes.
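To make the doubling concrete, the sketch below builds the same pattern twice, once as a regex literal and once from a double-quoted Perl string, and runs both on a sample line. The variable names and sample text are assumptions for illustration.

```perl
use strict;
use warnings;

# Regex literal: \\ in the source is one literal backslash in the pattern.
my $from_literal = qr/"(?:\\?.)*?"/;

# Same pattern held in a double-quoted Perl string: each backslash is
# doubled again, and the quote characters are escaped for the string.
my $pattern_text = "\"(?:\\\\?.)*?\"";
my $from_string  = qr/$pattern_text/;

my $text = <<'END';
x = "a \"b\" c";
END

for my $re ($from_literal, $from_string) {
    print "match: $1\n" if $text =~ /($re)/;
}

# Caveat: because the backslash is optional, on unterminated input the
# engine can backtrack and stop at an escaped quote.
my $broken = '"abc\" def';
print "caveat: $1\n" if $broken =~ /($from_literal)/;
```

Both forms find the same literal in the sample line. The last check is worth keeping in mind: on input with no real closing quote, this pattern still matches up to the escaped quote, which the alternation-based patterns in the other answers do not.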



Answer 7:

Try this pattern: (\".+")