In the documentation I read:
Use \A and \z to match the start and end of the string, ^ and $ match the start/end of a line.
I am going to apply a regular expression to check a username (or an e-mail address, the case is the same) submitted by a user. Which expression should I use with validates_format_of in the model? I can't understand the difference: I've always used ^ and $ ...
Difference By Example
/^foo$/ matches any of the following, /\Afoo\z/ does not:
/^foo$/ and /\Afoo\z/ both match the following:
The start and end of a string may not necessarily be the same thing as the start and end of a line. Imagine if you used the following as your test string:
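As a stand-in for such a test string, here is a minimal Ruby sketch. The string contents and the constant names (SAMPLE_TEXT, LINE_FOO, STRING_FOO) are assumptions made up for illustration. It shows /^foo$/ matching a line in the middle of a multi-line string, while /\Afoo\z/ matches only when the whole string is foo.

```ruby
# Hypothetical multi-line test string (an assumption for illustration).
SAMPLE_TEXT = "first line\nfoo\nlast line"

# ^ and $ anchor at line boundaries, so a line in the middle can match.
LINE_FOO = /^foo$/
# \A and \z anchor at the boundaries of the whole string.
STRING_FOO = /\Afoo\z/

puts LINE_FOO.match?(SAMPLE_TEXT)     # true  (the second line is "foo")
puts STRING_FOO.match?(SAMPLE_TEXT)   # false (the string is more than "foo")
puts LINE_FOO.match?("foo")           # true
puts STRING_FOO.match?("foo")         # true

# ^ finds the first word of every line ...
p SAMPLE_TEXT.scan(/^\w+/)            # ["first", "foo", "last"]
# ... while \A finds only the first word of the string.
p SAMPLE_TEXT.scan(/\A\w+/)           # ["first"]
```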
Notice that the string has many lines in it. The ^ and $ characters allow you to match the beginning and end of those lines (basically treating the \n character as a delimiter), while \A and \z allow you to match the beginning and end of the entire string.
If you're depending on the regular expression for validation, you always want to use \A and \z. ^ and $ only match up to a newline character, which means someone could submit an email like me@example.com\n<script>dangerous_stuff();</script> and still have it validate, since the regex only sees everything before the \n.
My recommendation would be to strip newlines from a username or email completely beforehand, since there's pretty much no legitimate reason for one. Then you can safely use either \A and \z or ^ and $.
According to Pickaxe:
So, use \A and lowercase \z. If you use \Z, someone could sneak in a trailing newline character. I don't think this is dangerous, but it might break algorithms that assume there is no whitespace in the string. Depending on your regex and string-length constraints, someone could even use an invisible name consisting of just a newline character.
JavaScript's regex implementation treats \A as a literal 'A' (ref). So watch yourself out there and test.
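To make the Ruby validation advice above concrete, here is a minimal plain-Ruby sketch comparing the three anchor styles on the inputs discussed in the answers. The e-mail pattern is deliberately simplified, and the constant names are hypothetical. It is not a recommended e-mail validator, only a way to see the anchors behave differently.

```ruby
# Simplified, hypothetical e-mail pattern body (an assumption for illustration);
# the three regexes differ only in their anchors.
EMAIL_LINE_ANCHORED   = /^[\w.+-]+@[\w-]+(\.[\w-]+)+$/
EMAIL_STRING_ANCHORED = /\A[\w.+-]+@[\w-]+(\.[\w-]+)+\z/
EMAIL_UPPER_Z         = /\A[\w.+-]+@[\w-]+(\.[\w-]+)+\Z/

# Input from the answer above: a valid-looking first line, then a payload.
INJECTED = "me@example.com\n<script>dangerous_stuff();</script>"
# Input with nothing but a trailing newline after the address.
TRAILING_NEWLINE = "me@example.com\n"

# $ matches just before the first newline, so the payload slips through.
puts EMAIL_LINE_ANCHORED.match?(INJECTED)            # true
# \z only matches at the very end of the string, so this is rejected.
puts EMAIL_STRING_ANCHORED.match?(INJECTED)          # false

# \Z tolerates one trailing newline; \z does not.
puts EMAIL_UPPER_Z.match?(TRAILING_NEWLINE)          # true
puts EMAIL_STRING_ANCHORED.match?(TRAILING_NEWLINE)  # false
```

In a model, the same idea would mean passing a \A...\z pattern as the with: option of validates_format_of; the pattern body is up to you.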