I'm trying to analyse some SQLCMD scripts for code quality tests. I have a regex not working as expected:
^(\s*)USE (\[?)(?<![master|\$])(.)+(\]?)
I'm trying to match:
- Strings that start with USE (ignore whitespace)
- Followed by optional square bracket
- Followed by 1 or more non-whitespace characters.
- EXCEPT where that text is "master" (case insensitive)
- OR EXCEPT where that text is a $ symbol
Expected results:
USE [master] - don't match
USE [$(CompiledDatabaseName)] - don't match
USE [anything_else.01234] - match
Also, the same patterns above without the [ and ] characters.
I'm using Sublime Text 2 as my RegEx search tool and referencing this cheatsheet
Your pattern, ^(\s*)USE (\[?)(?<![master|\$])(.)+(\]?), has several problems:
- [master|\$] is a character class: it matches one single character out of m, a, s, t, e, r, | and $, not the character sequence master or a $. To list alternatives you need a group, (master|\$).
- Once the character class is replaced with (...), the lookbehind becomes variable-width (its length is not known beforehand) and is thus invalid in a Boost regex.
- (.)+ repeats a capturing group, so the group will only contain the last character captured. You could use (.+), but this also matches spaces, while you need 1 or more non-whitespace characters.
- ? is the one or zero times quantifier, but you say you might have 2 opening and closing brackets, so you need a limiting quantifier, {0,2}.
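The points above can be reproduced outside Sublime Text. The following is a minimal sketch in Python, whose `re` module is an assumption here and not the Boost engine, although it treats character classes, repeated groups and variable-width lookbehinds the same way for these cases. The variable names are illustrative only.

```python
import re

# A character class matches ONE character from the set, not the word "master".
char_class = re.compile(r"[master|\$]")
print(bool(char_class.fullmatch("m")))       # True: a single listed character
print(bool(char_class.fullmatch("master")))  # False: six characters, not one

# Python's re also rejects a lookbehind whose alternatives differ in length.
try:
    re.compile(r"(?<!master|\$)x")
    lookbehind_ok = True
except re.error:
    lookbehind_ok = False
print(lookbehind_ok)

# A repeated group keeps only its last iteration; quantify inside the group instead.
last_only = re.match(r"(.)+", "abc").group(1)
whole = re.match(r"(.+)", "abc").group(1)
print(last_only, whole)
```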
You can use
^\h*USE(?!\h*\[{0,2}[^]\s]*(?:\$|(?i:master)))\h*\[{0,2}[^]\s]*]{0,2}
See regex demo
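To check the pattern against the expected results from the question without opening the demo, here is a rough Python port. It is a sketch under assumptions: Python's `re` does not support `\h`, so `[ \t]` stands in for it, and the helper name `flags_use_statement` is made up for this example.

```python
import re

# Port of the pattern above; [ \t] replaces \h (an assumption of this sketch).
USE_PATTERN = re.compile(
    r"^[ \t]*USE"
    r"(?![ \t]*\[{0,2}[^]\s]*(?:\$|(?i:master)))"  # reject $ or master in the name
    r"[ \t]*\[{0,2}[^]\s]*]{0,2}",
    re.MULTILINE,
)


def flags_use_statement(line: str) -> bool:
    """Return True when the line starts with a USE statement the pattern matches."""
    return USE_PATTERN.match(line) is not None


samples = [
    "USE [master]",
    "USE [$(CompiledDatabaseName)]",
    "USE [anything_else.01234]",
    "USE master",
    "USE anything_else",
    "  USE [[db]]",
]
for sample in samples:
    print(f"{sample!r}: {flags_use_statement(sample)}")
```

Because the lookahead scans the whole name, a name that merely contains master or $ (for example `USE [webmaster]`) is skipped as well, which may or may not be what you want.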
Explanation:
- ^ - start of a line in Sublime Text
- \h* - optional horizontal whitespace (if you need to match newlines, use \s*)
- USE - a literal case-sensitive character sequence USE
- (?!\h*\[{0,2}[^]\s]*(?:\$|(?i:master))) - a negative lookahead that makes sure the USE is NOT followed by:
  - \h* - zero or more horizontal whitespace characters
  - \[{0,2} - zero, one or two [ brackets
  - [^]\s]* - zero or more characters other than ] and whitespace
  - (?:\$|(?i:master)) - either a $ or a case-insensitive master (we turn off case sensitivity for this part only with the (?i:...) construct)
- \h* - go on matching zero or more horizontal whitespace characters
- \[{0,2} - zero, one or two [ brackets
- [^]\s]* - zero or more characters other than ] and whitespace (when ] is the first character in a character class, it does not have to be escaped in Boost/PCRE regexps)
- ]{0,2} - zero, one or two ] brackets (outside of a character class, the closing square bracket does not need escaping)