Google RE2 regex: escaping periods and underscores

Posted 2019-08-14 14:19

Question:

I'm trying to validate a username string with the following characteristics:

  • Must not start with a . or _
  • Must not end with a .
  • Must not contain two . in a row
  • Only lowercase letters and digits (plus . and _ where the rules above allow them)

My code is username.matches('^(?!\.)(?!_)(?!.*\.$)(?!.*?\.\.)[a-z0-9_.]+$')

It works in an online regex tester:

https://regex101.com/r/bDXMg3/2/

But the same pattern in Google RE2 syntax (used in Firestore Security Rules) throws a ton of errors.

I then tried double-escaping each .

using the code username.matches('^(?!\\.)(?!_)(?!.*\\.$)(?!.*?\\.\\.)[a-z0-9_.]+$')

Now the editor flags only one error (a red ^ marker at the beginning), but it then reports the following:

Invalid regular expression pattern. Pattern: ^(?!\.)(?!_)(?!.*\.$)(?!.*?\.\.)[a-z0-9_.]+$.

Can anyone let me know what I'm doing wrong?

Answer 1:

RE2 does not support lookaheads (nor lookbehinds).

However, the pattern can be rewritten without lookarounds:

^[a-z0-9][a-z0-9_]*([.][a-z0-9_]+)*$

Details

  • ^ - start of string
  • [a-z0-9] - a lowercase letter or digit
  • [a-z0-9_]* - zero or more lowercase letters, digits, or underscores
  • ([.][a-z0-9_]+)* - zero or more sequences of
    • [.] - a dot
    • [a-z0-9_]+ - one or more lowercase letters, digits, or underscores
  • $ - end of string.
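
As a sanity check, the rewritten pattern can be compared against the original lookahead version. The sketch below is not from the original post and makes one assumption: it uses Python's re module (which is not RE2, but does accept lookaheads) to run both patterns side by side. The rewritten pattern uses only character classes, groups, quantifiers and anchors, so the same text should be accepted by RE2 as well. The names LOOKAHEAD, REWRITTEN, ALPHABET and agree are made up for this illustration.

```python
import itertools
import re

# Original pattern from the question (uses lookaheads, which RE2 rejects).
LOOKAHEAD = re.compile(r'^(?!\.)(?!_)(?!.*\.$)(?!.*?\.\.)[a-z0-9_.]+$')

# Rewritten pattern from the answer (no lookarounds).
REWRITTEN = re.compile(r'^[a-z0-9][a-z0-9_]*([.][a-z0-9_]+)*$')


def agree(s):
    """Return True when both patterns give the same verdict for s."""
    return bool(LOOKAHEAD.match(s)) == bool(REWRITTEN.match(s))


# One representative per character kind: letter, digit, underscore, dot.
ALPHABET = "a0_."

# Try every string of length 0 to 5 over the small alphabet and
# collect any input on which the two patterns disagree.
mismatches = [
    "".join(chars)
    for n in range(0, 6)
    for chars in itertools.product(ALPHABET, repeat=n)
    if not agree("".join(chars))
]

print("mismatches:", mismatches)

# A few hand-picked usernames checked against the rewritten pattern.
for name in ["john.doe", "john_", ".john", "_john", "john.", "jo..hn", "John"]:
    print(name, bool(REWRITTEN.match(name)))
```

An empty mismatches list means the two patterns agree on every short string built from those four characters. That is evidence of equivalence rather than a proof.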


Tags: regex re2