Assuming I have:
StartTest
NoInclude
EndTest
StartTest
Include
EndTest
and am using:
/StartTest(?!NoInclude)[\s\S]*?EndTest/g
Why am I matching both groups?
Regexr example: http://regexr.com/3db8m
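The lookahead only fails the match if NoInclude appears straight after StartTest. In your sample something else (a line break) comes right after StartTest, so (?!NoInclude) succeeds at that position and [\s\S]*? is then free to run across NoInclude up to the first EndTest. You need a tempered greedy token:
(?s)StartTest(?:(?!(?:Start|End)Test|NoInclude).)*EndTest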
The regex matches StartTest, then any text that is not StartTest, EndTest or NoInclude, up to the EndTest.
Since the * is greedy, it makes the . match as much as it can. The negative lookahead makes it stop at any location that is followed by one of these alternatives:
  (?:Start|End)Test - StartTest or EndTest
  NoInclude - just NoInclude
NOTE: (?s) is an inline modifier (the equivalent of the RegexOptions.Singleline flag) that changes the behavior of . in the pattern so that it matches LF (newline) characters, too. Without this modifier (or without RegexOptions.Singleline), a dot matches any character except a newline.
NOTE2: If you are testing a regex outside of the native code environment, make sure you are using an appropriate tester for your regex flavor. regexr.com only supports the JavaScript flavor, regex101.com supports the JS, PCRE and Python flavors, and RegexStorm.net/RegexHero.net support the .NET flavor. There are many more testers around; check what they support before relying on them.
Here is a C# demo:
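A minimal sketch, assuming the tempered greedy token pattern (?s)StartTest(?:(?!(?:Start|End)Test|NoInclude).)*EndTest and the sample input from the question. The class and method names (TemperedTokenDemo, FindBlocks) are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TemperedTokenDemo
{
    // Tempered greedy token: consume one character at a time, but only while
    // the current position does not start "StartTest", "EndTest" or "NoInclude".
    // (?s) lets the dot match newlines as well.
    private const string Pattern =
        @"(?s)StartTest(?:(?!(?:Start|End)Test|NoInclude).)*EndTest";

    // Returns every StartTest...EndTest block that contains no NoInclude.
    public static List<string> FindBlocks(string input)
    {
        var results = new List<string>();
        foreach (Match m in Regex.Matches(input, Pattern))
        {
            results.Add(m.Value);
        }
        return results;
    }

    public static void Main()
    {
        // Sample input from the question, one token per line.
        string input =
            "StartTest\nNoInclude\nEndTest\nStartTest\nInclude\nEndTest";

        foreach (string block in FindBlocks(input))
        {
            Console.WriteLine(block);
            Console.WriteLine("---");
        }
    }
}
```

With this input, only the second block (the one containing Include) should be returned: in the first block the tempered token stops in front of NoInclude, EndTest cannot match there, and the attempt starting at the first StartTest fails as a whole.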