Regex expressions in Java, \\\\s vs. \\\\s+

2020-02-16 05:54发布

问题:

What's the difference between the following two expressions?

x = x.replaceAll("\\s", "");
x = x.replaceAll("\\s+", "");

回答1:

The first one matches a single whitespace, whereas the second one matches one or many whitespaces. They're the so-called regular expression quantifiers, and they perform matches like this (taken from the documentation):

Greedy quantifiers
X?  X, once or not at all
X*  X, zero or more times
X+  X, one or more times
X{n}    X, exactly n times
X{n,}   X, at least n times
X{n,m}  X, at least n but not more than m times

Reluctant quantifiers
X?? X, once or not at all
X*? X, zero or more times
X+? X, one or more times
X{n}?   X, exactly n times
X{n,}?  X, at least n times
X{n,m}? X, at least n but not more than m times

Possessive quantifiers
X?+ X, once or not at all
X*+ X, zero or more times
X++ X, one or more times
X{n}+   X, exactly n times
X{n,}+  X, at least n times
X{n,m}+ X, at least n but not more than m times


回答2:

Those two replaceAll calls will always produce the same result, regardless of what x is. However, it is important to note that the two regular expressions are not the same:

  • \\s - matches single whitespace character
  • \\s+ - matches sequence of one or more whitespace characters.

In this case, it makes no difference, since you are replacing everything with an empty string (although it would be better to use \\s+ from an efficiency point of view). If you were replacing with a non-empty string, the two would behave differently.



回答3:

First of all you need to understand that final output of both the statements will be same i.e. to remove all the spaces from given string.

However x.replaceAll("\\s+", ""); will be more efficient way of trimming spaces (if string can have multiple contiguous spaces) because of potentially less no of replacements due the to fact that regex \\s+ matches 1 or more spaces at once and replaces them with empty string.

So even though you get the same output from both it is better to use:

x.replaceAll("\\s+", "");


回答4:

The first regex will match one whitespace character. The second regex will reluctantly match one or more whitespace characters. For most purposes, these two regexes are very similar, except in the second case, the regex can match more of the string, if it prevents the regex match from failing.  from http://www.coderanch.com/t/570917/java/java/regex-difference