I need to remove some substrings from strings (in a large dataset). The substrings often contain special characters such as ., ^, and /, and replaceAll() treats them as regex metacharacters: a dot, for example, matches any character, which is not what I want.
Are there other functions that do the replacement without treating the first argument as a regex?
Just use String.replace(). It also replaces every occurrence, but it treats the target as literal text, so you do not have to worry about regex special characters.
Documentation
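To make the difference concrete, here is a minimal sketch. The class name LiteralReplaceDemo and the helper removeLiteral are made up for illustration; only String.replace and String.replaceAll come from the JDK.

```java
class LiteralReplaceDemo {
    // Hypothetical helper: removes every literal occurrence of target from input.
    static String removeLiteral(String input, String target) {
        return input.replace(target, "");
    }

    public static void main(String[] args) {
        String input = "a.b^c/d";
        // replace() treats "." as a plain dot.
        System.out.println(removeLiteral(input, "."));   // prints ab^c/d
        // replaceAll() treats "." as "any character", so everything is removed.
        System.out.println(input.replaceAll(".", ""));   // prints an empty line
    }
}
```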
You can match literally. For instance, if we want to match "<.]}^", we can do:
Pattern pat = Pattern.compile("<.]}^", Pattern.LITERAL);
and use that pattern.
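As a rough sketch of what "use that pattern" could look like for removing substrings: the class and method names here are hypothetical, and the empty replacement is just the case from the question.

```java
import java.util.regex.Pattern;

class LiteralPatternDemo {
    // Hypothetical helper: compiles target as a literal pattern and removes all matches.
    static String removeWithLiteralPattern(String input, String target) {
        Pattern pat = Pattern.compile(target, Pattern.LITERAL);
        return pat.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(removeWithLiteralPattern("x<.]}^y", "<.]}^")); // prints xy
    }
}
```

Note that Pattern.LITERAL only affects the pattern. If the replacement text can contain $ or backslashes, Matcher.replaceAll still interprets them, and Matcher.quoteReplacement can be used to neutralize that.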
You can also use backslashes to escape it. Note that the string literal itself needs backslashes, so escaping a single dot will take two backslashes, as follows:
Pattern pat=Pattern.compile("\\.");
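The first backslash is consumed by the Java compiler, which turns the pair into a single backslash in the string; the regex parser then sees that backslash followed by the dot and treats the dot literally.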
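A small sketch of the two levels of escaping, with a made-up class name and helper:

```java
class EscapedDotDemo {
    // Hypothetical helper: removes literal dots using a backslash-escaped regex.
    static String removeDots(String input) {
        return input.replaceAll("\\.", "");
    }

    public static void main(String[] args) {
        // The source text "\\." is a two-character string: a backslash and a dot.
        System.out.println("\\.".length());       // prints 2
        System.out.println(removeDots("1.2.3"));  // prints 123
    }
}
```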
Just use String.replace(String, String), not replaceAll. String.replace doesn't treat its argument as a regex.
There are 2 methods named replace in the String class that perform replacement without treating their parameters as regular expressions.
One replace method replaces one char with another char.
The other replace method replaces a CharSequence (usually a String) with another CharSequence.
Quoting the Javadocs from the second replace method:
Replaces each substring of this string that matches the literal target
sequence with the specified literal replacement sequence.
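Both overloads can be sketched side by side. The class ReplaceOverloadsDemo and its two wrapper methods are hypothetical names used only to show the signatures.

```java
class ReplaceOverloadsDemo {
    // Wraps String.replace(char, char).
    static String swapChar(String input, char from, char to) {
        return input.replace(from, to);
    }

    // Wraps String.replace(CharSequence, CharSequence); both arguments are literal.
    static String swapText(String input, CharSequence target, CharSequence replacement) {
        return input.replace(target, replacement);
    }

    public static void main(String[] args) {
        System.out.println(swapChar("a.b.c", '.', '-'));        // prints a-b-c
        System.out.println(swapText("1+1=2", "+", " plus "));   // prints 1 plus 1=2
        // The replacement is literal too: "$1" is not read as a group reference.
        System.out.println(swapText("cost: X", "X", "$1"));     // prints cost: $1
    }
}
```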
Is there other functions to do the "replace"
Yes, it is called replace :) The main difference between it and replaceAll is that it treats regex special characters as literal text.
BTW, if you want to escape regex special characters in a string, you can
- use yourString = Pattern.quote(yourString),
- surround it with "\\Q" and "\\E".
To escape only some special characters, you can
- put "\\" before them, like \\.
- surround them with "[" and "]", like [.] (this works for most special characters).
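The escaping options listed above can be compared in one short sketch. The class name, the helper removeQuoted, and the sample input are invented for illustration.

```java
import java.util.regex.Pattern;

class EscapeOptionsDemo {
    // Hypothetical helper: quotes the whole target so replaceAll matches it literally.
    static String removeQuoted(String input, String target) {
        return input.replaceAll(Pattern.quote(target), "");
    }

    public static void main(String[] args) {
        String input = "v1.2^beta";
        // Whole-string quoting: Pattern.quote and a manual \Q...\E wrapper.
        System.out.println(removeQuoted(input, ".2^"));           // prints v1beta
        System.out.println(input.replaceAll("\\Q.2^\\E", ""));    // prints v1beta
        // Escaping a single character: backslash or character class.
        System.out.println(input.replaceAll("\\.", ""));          // prints v12^beta
        System.out.println(input.replaceAll("[.]", ""));          // prints v12^beta
    }
}
```

Pattern.quote is usually the safer of the two whole-string options, since a manual \Q...\E wrapper breaks if the text itself contains \E. The bracket trick does not work for every character: a ^ placed first inside brackets negates the class instead of matching a caret.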