I need a Perl regular expression to match a string. I'm assuming only double-quoted strings, that a \" is a literal quote character and NOT the end of the string, and that a \ is a literal backslash character and should not escape a quote character. If it's not clear, some examples:
"\"" # string is 1 character long, contains dobule quote
"\\" # string is 1 character long, contains backslash
"\\\"" # string is 2 characters long, contains backslash and double quote
"\\\\" # string is 2 characters long, contains two backslashes
I need a regular expression that can recognize all 4 of these possibilities, and all other simple variations on those possibilities, as valid strings. What I have now is:
/".*[^\\]"/
But that's not right - it won't match any of those except the first one. Can anyone give me a push in the right direction on how to handle this?
How about this?
/"([^\\"]|\\\\|\\")*"/
matches zero or more characters that aren't slashes or quotes OR two slashes OR a slash then a quote
/"(?:[^\\"]|\\.)*"/
This is almost the same as Cal's answer, but has the advantage of matching strings containing escape codes such as \n
.
The ?:
characters are there to prevent the contained expression being saved as a backreference, but they can be removed.
A generic solution(matching all backslashed characters):
/ \A " # Start of string and opening quote
(?: # Start group
[^\\"] # Anything but a backslash or a quote
| # or
\\. # Backslash and anything
)* # End of group
" \z # Closing quote and end of string
/xms
See Text::Balanced. It's better than reinvent wheel. Use gen_delimited_pat
to see result pattern and learn form it.
RegExp::Common is another useful tool to be aware of. It contains regexps for many common cases, included quoted strings:
use Regexp::Common;
my $str = '" this is a \" quoted string"';
if ($str =~ $RE{quoted}) {
# do something
}
Here's a very simple way:
/"(?:\\?.)*?"/
Just remember if you're embedding such a regex in a string to double the backslashes.
Try this piece of code : (\".+")