Regex: what does (?! …) mean?

Posted 2019-03-24 13:09

Question:

The following regex finds text between substrings FTW and ODP.

/FTW(((?!FTW|ODP).)+)ODP+/

What does the (?!...) do?

Answer 1:

(?!regex) is a zero-width negative lookahead. Starting at the current cursor position, it checks that the characters ahead do NOT match the supplied regex, and then returns the cursor to where it started without consuming anything.
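To make "zero-width" concrete, here is a small sketch in Python (chosen only for illustration; the question does not name a language, and the sample strings are made up). It uses the same `(?!FTW|ODP)` lookahead as the question and shows that the lookahead itself consumes no characters.

```python
import re

# The lookahead on its own matches an empty string: it only checks a position.
empty = re.match(r"(?!FTW|ODP)", "jj")
print(empty.span())        # start and end are the same position

# Followed by '.', it consumes exactly one character, and only if the text
# at that position does not begin with 'FTW' or 'ODP'.
one_char = re.match(r"(?!FTW|ODP).", "jFTW")
print(one_char.group(0))   # the single character 'j'

# When the text at the cursor does begin with 'ODP', the assertion fails.
blocked = re.match(r"(?!FTW|ODP).", "ODPx")
print(blocked)             # None
```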

The whole regexp:

/
 FTW           # Match Characters 'FTW'
 (             # Start Match Group 1
  (             # Start Match Group 2
   (?!FTW|ODP)   # Ensure next characters are NOT 'FTW' or 'ODP', without matching
   .             # Match one character
  )+            # End Match Group 2, Match One or More times
 )             # End Match Group 1
 OD            # Match characters 'OD'
 P+            # Match 'P' One or More times
/

So: hunt for FTW, then capture until ODP+ ends our string, while ensuring that the data between FTW and ODP+ contains neither FTW nor ODP.
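The commented breakdown above can be tried out more or less as written. The following is a sketch using Python's `re.VERBOSE` flag, which permits whitespace and `#` comments inside a pattern; the sample input string is invented for the example.

```python
import re

pattern = re.compile(r"""
    FTW               # match the characters 'FTW'
    (                 # start group 1
      (               # start group 2
        (?!FTW|ODP)   # next characters must NOT be 'FTW' or 'ODP'
        .             # match one character
      )+              # end group 2, repeated one or more times
    )                 # end group 1
    OD                # match the characters 'OD'
    P+                # match 'P' one or more times
""", re.VERBOSE)

m = pattern.search("xxFTWhello worldODPPyy")
print(m.group(0))   # whole match, from FTW through the trailing P's
print(m.group(1))   # text between the delimiters
print(m.group(2))   # only the last character matched by the repeated group
```

Note that group 2 sits inside a repetition, so it ends up holding only the final character it matched; group 1 is the one that holds the full text between the delimiters.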



Answer 2:

From perldoc:

A zero-width negative look-ahead assertion. For example /foo(?!bar)/ matches any occurrence of "foo" that isn't followed by "bar". Note however that look-ahead and look-behind are NOT the same thing. You cannot use this for look-behind.

If you are looking for a "bar" that isn't preceded by a "foo", /(?!foo)bar/ will not do what you want. That's because the (?!foo) is just saying that the next thing cannot be "foo"--and it's not, it's a "bar", so "foobar" will match. You would have to do something like /(?!foo)...bar/ for that. We say "like" because there's the case of your "bar" not having three characters before it. You could cover that this way: /(?:(?!foo)...|^.{0,2})bar/ . Sometimes it's still easier just to say:

if (/bar/ && $` !~ /foo$/)
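The patterns from the quoted passage can be checked with a short Python sketch. The helper `bar_not_after_foo` is a hypothetical name and only a rough analogue of the Perl one-liner above. The last pattern uses a negative look-behind, `(?<!foo)`, which the quoted text does not show; Python's `re` module supports it for fixed-width patterns.

```python
import re

# A lookahead placed before 'bar' does not look behind: 'foobar' still matches.
naive = re.compile(r"(?!foo)bar")
print(naive.search("foobar"))        # a match object covering 'bar'

# The workaround from the quoted text.
workaround = re.compile(r"(?:(?!foo)...|^.{0,2})bar")
print(workaround.search("foobar"))   # None
print(workaround.search("bazbar"))   # matches 'bazbar'
print(workaround.search("bar"))      # matches 'bar' via the ^.{0,2} branch

def bar_not_after_foo(text):
    """Rough analogue of the Perl test: find 'bar', then inspect the prefix."""
    m = re.search(r"bar", text)
    return m is not None and not text[:m.start()].endswith("foo")

# Not part of the quoted text: a negative look-behind states the intent directly.
lookbehind = re.compile(r"(?<!foo)bar")
print(lookbehind.search("foobar"))   # None
```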


Answer 3:

It means "not followed by...". Technically, this is called a negative lookahead: it lets you peek at what's ahead in the string without consuming it. Lookaheads are zero-width assertions, meaning they match a position rather than any characters.



Answer 4:

The programmer must have been typing too fast. Some characters in the pattern got flipped. Corrected:

/WTF(((?!WTF|ODP).)+)ODP+/


Answer 5:

Regex

/FTW(((?!FTW|ODP).)+)ODP+/

matches the first FTW that is immediately followed by neither FTW nor ODP, then all the following characters up to the first ODP (but if there is an FTW somewhere among them, there will be no match from that starting point), then all the letters P that follow.

So in the string:

FTWFTWODPFTWjjFTWjjODPPPPjjODPPPjjj

it will match only FTWjjODPPPP, the part in square brackets:

FTWFTWODPFTWjj[FTWjjODPPPP]jjODPPPjjj
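The example string can be checked directly. This is a small sketch in Python, assuming its `re` module treats this pattern the same way as the engine the question was written for; nothing in the pattern is engine-specific beyond the lookahead.

```python
import re

text = "FTWFTWODPFTWjjFTWjjODPPPPjjODPPPjjj"
pattern = re.compile(r"FTW(((?!FTW|ODP).)+)ODP+")

# The first three FTW occurrences fail: each runs into another FTW or an ODP
# before a valid "FTW ... ODP" span can be completed.
matches = list(pattern.finditer(text))
m = matches[0]

print(len(matches))   # number of matches in the whole string
print(m.span())       # where the match sits in the string
print(m.group(0))     # the matched text
print(m.group(1))     # the captured text between FTW and ODP
```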



Answer 6:

'?!' is actually part of '(?! ... )'; it means that whatever is inside must NOT match at that location.