I am trying to create an application that matches a message template with a message that a user is trying to send. I am using Java regex for matching the message. The template/message may contain special characters.
How can I get the complete list of special characters that need to be escaped so that my regex works and matches in as many cases as possible?
Is there a universal solution for escaping all special characters in Java regex?
You can look at the javadoc of the Pattern class: http://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html
You need to escape every character listed there if you want it treated as a literal character rather than with its special meaning.
A possibly simpler solution is to put the template between \Q and \E; everything between them is treated as literal text.
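Here is a minimal sketch of the \Q...\E approach. The class and method names (LiteralBlockDemo, literal) are made up for illustration, and it assumes the template does not itself contain the sequence \E, which would end the quoted block early.

```java
import java.util.regex.Pattern;

class LiteralBlockDemo {
    // Hypothetical helper: wraps the text in \Q...\E so it is matched literally.
    // Assumption: the text does not contain "\E" itself.
    static Pattern literal(String text) {
        return Pattern.compile("\\Q" + text + "\\E");
    }

    public static void main(String[] args) {
        Pattern p = literal("1+1=2 (approx.)");
        // The quoted pattern matches only the exact text.
        System.out.println(p.matcher("1+1=2 (approx.)").matches()); // true
        // '+', '(' and '.' have lost their special meaning, so this does not match.
        System.out.println(p.matcher("11=2 approx.").matches());    // false
    }
}
```

If the template might contain \E, Pattern.quote is probably the safer choice, since it is documented to return a literal pattern for any input string.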
To escape, you can simply use this (available since Java 1.5):
Pattern.quote("$test");
The resulting pattern matches exactly the string $test.
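As a rough illustration of the difference quoting makes, the sketch below compares a quoted and an unquoted pattern. The class and method names (QuoteDemo, containsLiteral) and the sample message are assumptions for the example only.

```java
import java.util.regex.Pattern;

class QuoteDemo {
    // Hypothetical helper: true if the input contains the template text literally.
    static boolean containsLiteral(String template, String input) {
        return Pattern.compile(Pattern.quote(template)).matcher(input).find();
    }

    public static void main(String[] args) {
        String message = "price is $test today";
        // Quoted: '$' is an ordinary character.
        System.out.println(containsLiteral("$test", message)); // true
        // Unquoted: '$' is an end-of-input anchor, so "test" can never follow it.
        System.out.println(Pattern.compile("$test").matcher(message).find()); // false
    }
}
```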
According to the String Literals / Metacharacters documentation page, they are:
<([{\^-=$!|]})?*+.>
Also, it would be nice to have that list referenced somewhere in code, but I don't know where that could be...
On @Sorin's suggestion of the Java Pattern docs, it looks like chars to escape are at least:
\.[{(*+?^$|
Combining what everyone said, I propose the following, to keep the list of characters special to RegExp clearly listed in their own String, and to avoid having to try to visually parse thousands of "\\"'s. This seems to work pretty well for me:
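import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Characters treated as regex metacharacters, kept readable in one place.
final String regExSpecialChars = "<([{\\^-=$!|]})?*+.>";
// Prefix each character with a backslash so the list can sit safely inside a character class.
final String regExSpecialCharsRE = regExSpecialChars.replaceAll( ".", "\\\\$0");
final Pattern reCharsREP = Pattern.compile( "[" + regExSpecialCharsRE + "]");

String quoteRegExSpecialChars( String s)
{
    Matcher m = reCharsREP.matcher( s);
    // Put a backslash in front of every special character found in s.
    return m.replaceAll( "\\\\$0");
}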
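One way to sanity-check hand-rolled escaping like this is a round trip: the escaped string, compiled as a regex, should match the original text. The sketch below is a condensed variant of the same idea with the character class written out directly; the names EscapeCheck, SPECIAL and escape are made up for the example.

```java
import java.util.regex.Pattern;

class EscapeCheck {
    // Same character list as above, written as one character class.
    // '\', '[', ']' and '-' are escaped because they are special inside a class.
    static final Pattern SPECIAL =
            Pattern.compile("[\\\\<(\\[{^\\-=$!|\\]})?*+.>]");

    // Puts a backslash in front of every special character.
    static String escape(String s) {
        return SPECIAL.matcher(s).replaceAll("\\\\$0");
    }

    public static void main(String[] args) {
        String text = "$5 (x+y)?";
        System.out.println(escape("a.b*c"));                     // a\.b\*c
        // Round trip: the escaped form matches the original text literally.
        System.out.println(Pattern.matches(escape(text), text)); // true
        // Pattern.quote gives the same matching behaviour without a custom list.
        System.out.println(Pattern.matches(Pattern.quote(text), text)); // true
    }
}
```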
On the other hand, if in your application "special characters" simply means everything that is not a letter, a digit, or whitespace, you can use a negated character class like this:
String regex = "[^\\s\\w]*";
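A small sketch of how such a negated class could be used to detect special characters in a message. The names SpecialCharCheck and hasSpecial are hypothetical. Note that by default \w covers [a-zA-Z_0-9], so the underscore is not counted as special here.

```java
import java.util.regex.Pattern;

class SpecialCharCheck {
    // Matches a single character that is neither whitespace nor a word character.
    private static final Pattern NON_WORD_NON_SPACE = Pattern.compile("[^\\s\\w]");

    // Hypothetical helper: true if the text contains at least one such character.
    static boolean hasSpecial(String s) {
        return NON_WORD_NON_SPACE.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(hasSpecial("hello world_1")); // false
        System.out.println(hasSpecial("price: $5"));     // true
    }
}
```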
Not sure I fully understand your question, but I think you should look at
Matcher.quoteReplacement()
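Note that Matcher.quoteReplacement() deals with the replacement string (where \ and $ are special), not with the pattern itself, so it complements Pattern.quote rather than replacing it. A minimal sketch, assuming a made-up {amount} placeholder and a hypothetical fill helper:

```java
import java.util.regex.Matcher;

class ReplacementDemo {
    // Hypothetical helper: substitutes the value for the {amount} placeholder.
    // quoteReplacement keeps '$' and '\' in the value from being read as
    // group references or escapes.
    static String fill(String template, String value) {
        return template.replaceAll("\\{amount\\}", Matcher.quoteReplacement(value));
    }

    public static void main(String[] args) {
        System.out.println(fill("You owe {amount}.", "$5")); // You owe $5.
    }
}
```

Without quoteReplacement, the value "$5" would be interpreted as a reference to capture group 5, which does not exist in this pattern, and replaceAll would throw an exception.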