Thanks for the previous assistance, everyone! I have a query regarding RegExp in Perl.
My issue is..
I know that when matching you can write m// or // or ## (you must include the m or s if you use hashes). What is confusing me is a book example on escaping characters. I believe most people escape lots of characters as a sure-fire way of making the program work without missing a metacharacter, e.g. \@ when looking to match @ in, say, an email address.
Here's my issue and I know what this script does:
$date = "15/12/99";
$date=~ s#(\d+)/(\d+)/(\d+)#$1/$2/$3#; << why are no forward slashes escaped??
print($date);
Yet a later example I have shows it rewritten as follows (which I also understand, and there they're escaped):
$date =~ s/(\d+)\/(\d+)\/(\d+)/$2\/$1\/$3/; <<<< which is escaping the forward slashes.
I know that choosing slashes or hashes is a matter of programmer preference. What I don't understand is why the second example escapes the slashes, yet the first doesn't. I have tried both and they work either way. So no escaping of slashes when using hashes? What's even MORE confusing is that yet another book example, earlier than this one and again using hashes, escapes the @ symbol.
if ($address =~ m#\@#) { print("That's an email address"); }
or something similar
So what do you escape and what don't you, when using hashes or slashes? I know you have to escape metacharacters to match them, but I'm confused.
The forward slashes are not metacharacters in themselves; only their use as expression separators in the second example makes them "special".
The format of a substitute expression is: s/PATTERN/REPLACEMENT/MODIFIERS, where the character that follows the s is the separator.
In the first example, using a hash as the first character after the =~ s makes that character the expression separator, so the forward slash is not special and does not require any escaping. In the second example, the expression separator is the forward slash itself, so it must be escaped within the expressions themselves.
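To see the two forms side by side, here is a minimal sketch. The variable names $with_hash and $with_slash are made up for illustration, and the replacement uses the $2/$1/$3 order from the second book example.

```perl
use strict;
use warnings;

my $with_hash  = "15/12/99";
my $with_slash = "15/12/99";

# '#' is the separator here, so '/' is an ordinary character.
$with_hash =~ s#(\d+)/(\d+)/(\d+)#$2/$1/$3#;

# '/' is the separator here, so every literal '/' needs a backslash.
$with_slash =~ s/(\d+)\/(\d+)\/(\d+)/$2\/$1\/$3/;

print "$with_hash\n";
print "$with_slash\n";
```

Both statements should leave the same string behind; only the separator, and therefore what needs escaping, differs.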
When you build a regexp, you define a character as the delimiter for your regexp, e.g. // or ##. If you need to use that character inside your regexp, you will need to escape it so that the regexp engine does not see it as the end of the regexp.
If you build your regexp between forward slashes (/), you will need to escape the forward slashes contained in your regexp, hence the escaping in your second example. Of course, the same rule applies to any character you use as a regexp delimiter, not just forward slashes.
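The same rule can be shown the other way round: with a hash as the delimiter, it is the hash that needs escaping. This is only a sketch; the input string and the variable names are invented for the example.

```perl
use strict;
use warnings;

my $escaped   = "issue #42";
my $unescaped = "issue #42";

# '#' is the delimiter, so the literal '#' in the pattern must be escaped.
$escaped =~ s#\#(\d+)#number $1#;

# '/' is the delimiter, so '#' is just an ordinary character.
$unescaped =~ s/#(\d+)/number $1/;

print "$escaped\n";
print "$unescaped\n";
```

Picking a delimiter that does not occur in the pattern usually avoids the backslashes altogether.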
The regex match operator allows you to define a custom non-whitespace character as the separator.
In your first example, '#' is used as the separator, so in this regex you don't need to escape the '/' because it has no special meaning. In the second regex, the separator isn't changed, so the default '/' is used. Now you have to escape every '/' in your pattern; otherwise the parser gets confused. :)
If you are not using slashes, the recommended practice is to use curly braces and the /x modifier.
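As a rough illustration of the curly-brace style, the date example from the question could be written like this. The day/month/year labels in the comments are my assumption about what the fields mean.

```perl
use strict;
use warnings;

my $date = "15/12/99";

# With /x, whitespace and comments inside the pattern are ignored.
# The braces are the delimiters, so '/' needs no escaping.
$date =~ s{
    (\d+) /    # day (assumed)
    (\d+) /    # month (assumed)
    (\d+)      # year (assumed)
}{$2/$1/$3}x;

print "$date\n";
```

Note that /x only affects the pattern part; the replacement between the second pair of braces is taken as written.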
Escaping non-alphanumeric characters is also standard practice, even when they are not metacharacters. See perldoc -f quotemeta.
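A small sketch of what quotemeta does, using a made-up string. Note that @ is not a regex metacharacter; the backslash in a pattern such as m#\@# mainly stops Perl from trying to interpolate an array there, which is probably why the book escapes it.

```perl
use strict;
use warnings;

my $literal = 'a.b@c';               # hypothetical text to match literally
my $quoted  = quotemeta($literal);   # backslashes every non-word character

print "$quoted\n";

# \Q...\E applies the same quoting inside a pattern.
my $matches_literal = ('a.b@c' =~ /^\Q$literal\E$/) ? 1 : 0;
my $matches_other   = ('aXb@c' =~ /^\Q$literal\E$/) ? 1 : 0;

print "$matches_literal $matches_other\n";
```

Because the dot is quoted, it only matches a real dot, so the second string does not match.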
There is another depth to this question about escaping forward slashes in a substitution (s///). With my example, the capturing becomes the problem.
For this to work, the typo (the addition of a second forward slash) had to be captured. Also, trying to match just the two slashes did not work: the first slash has to be preceded by more than one character.
Changing "http://world.com/Photos//space_shots/out_of_this_world.jpg"
To: "http://world.com/Photos/space_shots/out_of_this_world.jpg"
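The code for that change is not shown above, so the following is only one possible way to do it, not necessarily the one used. It assumes that capturing a single character that is neither ':' nor '/' before the slashes is enough to leave the "http://" part alone.

```perl
use strict;
use warnings;

my $url = 'http://world.com/Photos//space_shots/out_of_this_world.jpg';

# '#' as the delimiter keeps the slashes unescaped.
# ([^:/]) captures the character before a run of slashes; the "//" after
# "http:" is skipped because it is preceded by ':'.
$url =~ s#([^:/])/{2,}#$1/#g;

print "$url\n";
```

The captured character is put back with $1, so only the extra slashes are removed.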
The question itself has been properly answered in several answers. But everything you always wanted to know about Perl regular expressions, but may or may not have been afraid to ask, can be found in perldoc perlre, perldoc perlrequick and perldoc perlretut. I recommend you read through them.