I need to match @anything_here@ from the string @anything_here@dhhhd@shdjhjs@, so I used the following regex:
^@.*?@
or
^@[^@]*@
Both work, but I would like to know which one is the better solution: a regex with non-greedy repetition or a regex with a negated character class?
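For reference, a minimal check that both patterns return the same match on the sample string. This is only a sketch and assumes Python's built-in re module; the variable names are illustrative, and other engines may differ in detail.

```python
import re

# Sample input from the question.
text = "@anything_here@dhhhd@shdjhjs@"

# Lazy dot: expands one character at a time until the next '@' can match.
lazy = re.match(r"^@.*?@", text)

# Negated character class: consumes everything that is not '@' in one go.
negated = re.match(r"^@[^@]*@", text)

print(lazy.group(0))
print(negated.group(0))
```

On this input both calls give the same matched text, so the choice between them comes down to performance and to how the patterns behave in other contexts.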
Negated character classes should usually be preferred over lazy matching, if possible.
If the regex is successful, ^@[^@]*@ can match the content between the @s in a single step, while ^@.*?@ needs to expand for each character between the @s.

When failing (for the case of no ending @), some regex engines (PCRE, for example) will apply a little magic and internally treat [^@]* as the possessive [^@]*+, as there is a clear-cut border between @ and non-@. Thus it will match to the end of the string, recognize the missing @ and not backtrack, but fail instantly. The lazy .*? will expand character by character as usual.

When used in larger contexts,
[^@]* will also never expand over the border of the ending @, while this is very well possible for the lazy matching. E.g. ^@[^@]*a[^@]*@ won't match @bbbb@a@, while ^@.*?a.*?@ will.

Note that
[^@] will also match newlines, while the dot (.) doesn't (in most regex engines, unless used in single-line mode). If that is not wanted, you can avoid it by adding the newline character to the negated class.

It is clear the ^@[^@]*@ option is much better.

The negated character class is quantified greedily, which means the regex engine grabs 0 or more chars other than
@ right away, as many as possible.

When you use a lazy dot matching pattern, the engine matches @, then tries to match the trailing @ (skipping the .*?). It does not find the @ at index 1, so the .*? matches the a char. This .*? pattern expands as many times as there are chars other than @ up to the first @.
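
The behavioural differences described above can be checked with a short script. This is a sketch that assumes Python's re module; the multi-line test string and the capturing group in the last pattern are additions for illustration and are not part of the original patterns.

```python
import re

# 1. Larger context: the negated class cannot cross an '@', the lazy dot can.
subject = "@bbbb@a@"
negated_ctx = re.match(r"^@[^@]*a[^@]*@", subject)  # no 'a' before the 2nd '@'
lazy_ctx = re.match(r"^@.*?a.*?@", subject)         # first .*? is pushed past an '@'

# 2. Newlines: [^@] matches them, '.' does not unless DOTALL is set.
multiline = "@line1\nline2@"                        # illustrative input
negated_nl = re.match(r"^@[^@]*@", multiline)
lazy_nl = re.match(r"^@.*?@", multiline)
lazy_dotall = re.match(r"^@.*?@", multiline, re.DOTALL)
negated_no_nl = re.match(r"^@[^@\n]*@", multiline)  # newline added to the negation

# 3. The lazy dot expands once per character before the first closing '@'.
#    The capturing group is added only to measure that span.
sample = "@anything_here@dhhhd@shdjhjs@"
expansions = len(re.match(r"^@(.*?)@", sample).group(1))

print(negated_ctx, lazy_ctx.group(0))
print(negated_nl is not None, lazy_nl, lazy_dotall is not None, negated_no_nl)
print(expansions)
```

The first pair shows why the negated class is the safer building block inside longer patterns: it can never be pushed across a delimiter by backtracking, whereas the lazy dot can.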