How to escape a square bracket for Pattern compila

Posted 2019-01-11 16:33

Question:

I have comma separated list of regular expressions:

.{8},[0-9],[^0-9A-Za-z ],[A-Z],[a-z]

I have split the list on the commas. Now I'm trying to match these regexes against a generated password. The problem is that Pattern.compile does not like square brackets that are not escaped. Can someone please give me a simple function that takes a string like [0-9] and returns the escaped string \[0-9\]?

Answer 1:

You can use Pattern.quote(String).

From the docs:

public static String quote(String s)

Returns a literal pattern String for the specified String.

This method produces a String that can be used to create a Pattern that would match the string s as if it were a literal pattern.

Metacharacters or escape sequences in the input sequence will be given no special meaning.
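
As a minimal sketch of what this means in practice, the class below quotes the string [0-9] and checks it against two inputs. The class name QuoteDemo and the helper matchesLiterally are made up for illustration; only Pattern.quote, Pattern.compile and Matcher.matches come from the JDK. Note that the quoted pattern matches the five characters [0-9] themselves, not a digit.

```java
import java.util.regex.Pattern;

public class QuoteDemo {
    // Hypothetical helper: true if `input` is exactly the literal text `literal`.
    static boolean matchesLiterally(String literal, String input) {
        // Pattern.quote turns every metacharacter in `literal` into plain text.
        String quoted = Pattern.quote(literal);
        return Pattern.compile(quoted).matcher(input).matches();
    }

    public static void main(String[] args) {
        // The literal text "[0-9]" matches itself.
        System.out.println(matchesLiterally("[0-9]", "[0-9]")); // true
        // The brackets no longer form a character class, so a digit does not match.
        System.out.println(matchesLiterally("[0-9]", "5"));     // false
    }
}
```

If the goal is to match a digit rather than the bracketed text, the pattern should not be quoted at all.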



Answer 2:

For some reason, the above answer didn't work for me. For those like me who come after, here is what I found.

I was expecting a single backslash to escape the bracket; however, you must use two if the pattern is stored in a string literal. The first backslash escapes the second one in the string literal, so what the regex engine sees is \]. Since the regex engine sees just one backslash, it uses it to escape the square bracket.

\\] 

In regex, that will match a single closing square bracket.

If you're trying to match a newline, on the other hand, a single backslash is enough. There you're using the string escape sequence to insert a newline character into the string: regex doesn't see \n, it sees the newline character itself and matches that. The bracket needs two backslashes because \] is not a string escape sequence; it's a regex escape sequence.
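
The difference between the two kinds of escape can be checked with a short sketch like the one below. The class name BackslashDemo and the helper contains are assumptions made for this example; it also assumes find() is the intended kind of match.

```java
import java.util.regex.Pattern;

public class BackslashDemo {
    // In Java source, "\\]" is a two-character string: a backslash and a bracket.
    static final String CLOSE_BRACKET = "\\]";
    // In Java source, "\n" is a one-character string: the newline itself.
    static final String NEWLINE = "\n";

    // Hypothetical helper: true if `regex` matches somewhere inside `input`.
    static boolean contains(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(CLOSE_BRACKET.length());         // 2
        System.out.println(NEWLINE.length());               // 1
        // The regex engine receives \] and matches a literal closing bracket.
        System.out.println(contains(CLOSE_BRACKET, "a]b")); // true
        // The regex engine receives a raw newline character and matches it.
        System.out.println(contains(NEWLINE, "a\nb"));      // true
    }
}
```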



Answer 3:

You can use the \Q and \E quoting escapes: anything between \Q and \E is treated as literal text.

\Q[0-9]\E
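
A rough sketch of how this looks from Java follows. Inside a Java string literal the backslashes of \Q and \E have to be doubled. The class name QuoteSpanDemo, the constants, and the helper fullMatch are invented for this example.

```java
import java.util.regex.Pattern;

public class QuoteSpanDemo {
    // The regex engine receives \Q[0-9]\E, i.e. the literal text "[0-9]".
    static final String LITERAL_CLASS = "\\Q[0-9]\\E";
    // Quoting can cover part of a pattern: literal "[x]" followed by one or more digits.
    static final String MIXED = "\\Q[x]\\E[0-9]+";

    // Hypothetical helper: true if the whole input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(fullMatch(LITERAL_CLASS, "[0-9]")); // true
        System.out.println(fullMatch(LITERAL_CLASS, "7"));     // false
        System.out.println(fullMatch(MIXED, "[x]42"));         // true
        System.out.println(fullMatch(MIXED, "x42"));           // false
    }
}
```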


Answer 4:

Pattern.compile() likes square brackets just fine. If you take the string

".{8},[0-9],[^0-9A-Za-z ],[A-Z],[a-z]"

and split it on commas, you end up with five perfectly valid regexes: the first one matches eight non-line-separator characters, the second matches an ASCII digit, and so on. Unless you really want to match strings like ".{8}" and "[0-9]", I don't see why you would need to escape anything.
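
To illustrate the point, here is one possible way to apply the five rules without any escaping. It is only a sketch: the class name PasswordRules and the method satisfiesAll are hypothetical, and it assumes each rule merely has to be found somewhere in the password (find() rather than matches()). Splitting on commas also assumes no rule contains a comma itself, which would not hold for a pattern such as .{8,}.

```java
import java.util.regex.Pattern;

public class PasswordRules {
    // The comma-separated rule list from the question, used unescaped.
    static final String RULES = ".{8},[0-9],[^0-9A-Za-z ],[A-Z],[a-z]";

    // Hypothetical check: every rule must match somewhere in the password.
    static boolean satisfiesAll(String password) {
        for (String rule : RULES.split(",")) {
            // Each piece compiles as an ordinary regex; the brackets are character classes.
            if (!Pattern.compile(rule).matcher(password).find()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(satisfiesAll("Abcdef1!")); // true: all five rules are met
        System.out.println(satisfiesAll("abcdefgh")); // false: no digit, symbol or uppercase letter
        System.out.println(satisfiesAll("Ab1!"));     // false: shorter than eight characters
    }
}
```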