I found these things in my regex body but I haven't got a clue what I can use them for. Does somebody have examples so I can try to understand how they work?
(?!) - negative lookahead
(?=) - positive lookahead
(?<=) - positive lookbehind
(?<!) - negative lookbehind
(?>) - atomic group
Lookarounds are zero-width assertions. They check for a regex to the right or left of the current position (ahead or behind, respectively), succeed or fail depending on whether a match is found (and on whether the assertion is positive or negative), and then discard the matched portion. They don't consume any characters: matching for the regex that follows them (if any) starts at the same cursor position.
Read regular-expression.info for more details.
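To make "zero width" concrete, here is a minimal sketch using Python's re module. The sample string foobar and the variable names are only illustrative assumptions.

```python
import re

# A bare lookahead succeeds without consuming anything: the match is empty.
probe = re.match(r"(?=foo)", "foobar")
probe_span = probe.span()    # start and end are the same position
probe_text = probe.group()   # empty string

# The pattern after the lookahead starts at the same position (index 0),
# so \w+ still sees the "foo" that the lookahead just checked.
full = re.match(r"(?=foo)\w+", "foobar")
full_text = full.group()

print(probe_span, repr(probe_text), full_text)
```

The lookahead only answers yes or no; whatever follows it does the actual consuming.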
Syntax:
(?=REGEX_1)REGEX_2
Match only if REGEX_1 matches; after matching REGEX_1, the match is discarded and searching for REGEX_2 starts at the same position.
example:
(?=[a-z0-9]{4}$)[a-z]{1,2}[0-9]{2,3}
REGEX_1 is
[a-z0-9]{4}$
which matches four lowercase alphanumeric chars followed by end of line. REGEX_2 is
[a-z]{1,2}[0-9]{2,3}
which matches one or two letters followed by two or three digits. REGEX_1 makes sure that the length of the string is indeed 4, but doesn't consume any characters, so the search for REGEX_2 starts at the same location. REGEX_2 then makes sure that the string matches some other rules. Without the look-ahead it would also match strings of length three or five.
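A quick way to try this is to combine REGEX_1 and REGEX_2 in Python and compare the result with REGEX_2 on its own. The sample strings and the helper name accepts are assumptions made for this sketch; re.match is used so that matching starts at the beginning of each string.

```python
import re

# REGEX_1 wrapped in a positive lookahead, followed by REGEX_2.
with_lookahead = re.compile(r"(?=[a-z0-9]{4}$)[a-z]{1,2}[0-9]{2,3}")
# REGEX_2 alone, for comparison.
without_lookahead = re.compile(r"[a-z]{1,2}[0-9]{2,3}")

def accepts(pattern, text):
    """Return True if pattern matches starting at the beginning of text."""
    return pattern.match(text) is not None

for sample in ["ab12", "a123", "a12", "ab123", "abc1"]:
    print(sample, accepts(with_lookahead, sample), accepts(without_lookahead, sample))
```

The strings a12 and ab123 are accepted by REGEX_2 alone but rejected once the lookahead enforces a total length of four.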
Syntax:
(?!REGEX_1)REGEX_2
Match only if REGEX_1 does not match; after checking REGEX_1, the search for REGEX_2 starts at the same position.
example:
The look-ahead part checks for FWORD in the string and fails if it finds it. If it doesn't find FWORD, the look-ahead succeeds and the following part verifies that the string's length is between 10 and 30 and that it contains only word characters (a-zA-Z0-9_).
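The behaviour described here can be sketched in Python. The pattern below is an assumed reconstruction that fits the description (reject FWORD anywhere, then require 10 to 30 word characters); it is not necessarily the exact pattern the answer used, and is_valid is a hypothetical helper name.

```python
import re

# Assumed pattern, reconstructed from the description:
# (?!.*FWORD) fails if FWORD occurs anywhere ahead,
# \w{10,30}$ then requires 10 to 30 word characters up to the end.
no_fword = re.compile(r"(?!.*FWORD)\w{10,30}$")

def is_valid(text):
    """Return True if text passes both the lookahead and the length check."""
    return no_fword.match(text) is not None

for sample in ["hello_world_123", "short", "has_FWORD_inside", "a" * 31]:
    print(sample, is_valid(sample))
```

Because the lookahead consumes nothing, the length check still runs over the whole string from its first character.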
Look-behind is similar to look-ahead: it just looks behind the current cursor position. Some regex flavors don't support look-behind assertions; JavaScript, for example, did not support them before ES2018. And most flavors that support it (PHP, Python, etc.) require the look-behind portion to have a fixed length.
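The fixed-length requirement is easy to observe in Python's re module, which refuses to compile a variable-width look-behind. The sample strings and variable names here are assumptions for illustration.

```python
import re

# Fixed-width look-behind: "\$" is exactly one character, so this compiles.
amount = re.search(r"(?<=\$)\d+", "cost: $42")
amount_text = amount.group()

# Variable-width look-behind: "a+" can have any length, so re rejects it.
try:
    re.compile(r"(?<=a+)b")
    variable_width_ok = True
except re.error:
    variable_width_ok = False

print(amount_text, variable_width_ok)
```

Other engines differ on this point, so it is worth checking the documentation of the flavor in use.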
Examples
Given the string
foobarbarfoo
the lookarounds can be used one at a time, and you can also combine them.
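As a sketch of how the four lookarounds behave on foobarbarfoo, the following Python snippet records where each pattern finds bar. The particular patterns are assumptions chosen to fit this string, and the last one combines a look-behind with a look-ahead.

```python
import re

text = "foobarbarfoo"

patterns = [
    r"bar(?=bar)",          # bar that has bar after it
    r"bar(?!bar)",          # bar that does not have bar after it
    r"(?<=foo)bar",         # bar that has foo before it
    r"(?<!foo)bar",         # bar that does not have foo before it
    r"(?<=foo)bar(?=bar)",  # combined: foo before it and bar after it
]

# Start index of the first match for each pattern.
starts = {p: re.search(p, text).start() for p in patterns}

for pattern, start in starts.items():
    print(pattern, "-> bar at index", start)
```

Index 3 is the first bar and index 6 is the second, so each pattern picks out exactly one of the two.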
Definitions
Look ahead positive
(?=)
Find expression A where expression B follows:
A(?=B)
Look ahead negative
(?!)
Find expression A where expression B does not follow:
A(?!B)
Look behind positive
(?<=)
Find expression A where expression B precedes:
(?<=B)A
Look behind negative
(?<!)
Find expression A where expression B does not precede:
(?<!B)A
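One point these definitions imply is that only A ends up in the match; B is checked but never included. A small Python sketch, with an assumed sample sentence and a hypothetical helper find, shows the matched text and span for each form.

```python
import re

text = "red car, red bus, blue car"

def find(pattern):
    """Return (matched text, span) of the first match of pattern in text."""
    m = re.search(pattern, text)
    return m.group(), m.span()

results = {
    "ahead positive": find(r"red(?= car)"),    # A = red, B = " car"
    "ahead negative": find(r"red(?! car)"),
    "behind positive": find(r"(?<=red )car"),  # A = car, B = "red "
    "behind negative": find(r"(?<!red )car"),
}

for name, (matched, span) in results.items():
    print(name, matched, span)
```

In every case the span covers just three characters, the length of A, even though B had to be present or absent next to it.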
Atomic groups
(?>)
An atomic group exits a group and throws away alternative patterns after the first matched pattern inside the group (backtracking is disabled).
(?>foo|foot)s
applied to foots will match its 1st alternative foo, then fail as s does not immediately follow, and stop as backtracking is disabled.
A non-atomic group will allow backtracking; if subsequent matching ahead fails, it will backtrack and use alternative patterns until a match for the entire expression is found or all possibilities are exhausted.
(foo|foot)s
applied to foots will:
match its 1st alternative foo, then fail as s does not immediately follow in foots, and backtrack to its 2nd alternative;
match its 2nd alternative foot, then succeed as s immediately follows in foots, and stop.
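The difference can be checked in Python. Native (?>...) syntax is only available in the re module from Python 3.11, so this sketch also uses a common emulation, a lookahead with a capture followed by a backreference, which is a substitute technique and not the atomic group syntax itself. The variable names are assumptions.

```python
import re
import sys

# Ordinary group: after foo fails to be followed by s,
# the engine backtracks and tries foot.
plain = re.search(r"(foo|foot)s", "foots")

# Emulated atomic group: the lookahead captures the first alternative that
# matches, \1 consumes it, and the engine does not re-enter the lookahead.
emulated = re.search(r"(?=(foo|foot))\1s", "foots")
emulated_foos = re.search(r"(?=(foo|foot))\1s", "foos")

# Native atomic group, attempted only where the syntax is supported.
native = None
if sys.version_info >= (3, 11):
    native = re.search(r"(?>foo|foot)s", "foots")

print(plain.group(), emulated, emulated_foos.group(), native)
```

Reordering the alternatives as foot|foo would let the atomic version match foots, which shows that the order of alternatives matters once backtracking is disabled.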
Grokking lookaround rapidly.
How do you distinguish lookahead from lookbehind? Take a two-minute tour with me:
Suppose
A B C
Now we ask B: where are you?
B has two ways to describe its location:
One: B has A ahead and has C behind.
Two: B is ahead (lookahead) of C and behind (lookbehind) A.
As we can see, "behind" and "ahead" are opposite in the two solutions.
Regex uses solution Two.