I am using Java regexes in Java 1.6 (inter alia to parse numeric output) and cannot find a precise definition of \b ("word boundary"). I had assumed that -12 would be an "integer word" (matched by \b\-?\d+\b) but it appears that this does not work. I'd be grateful to know of ways of matching space-separated numbers.
Example:
Pattern pattern = Pattern.compile("\\s*\\b\\-?\\d+\\s*");
String plus = " 12 ";
System.out.println(""+pattern.matcher(plus).matches());
String minus = " -12 ";
System.out.println(""+pattern.matcher(minus).matches());
pattern = Pattern.compile("\\s*\\-?\\d+\\s*");
System.out.println(""+pattern.matcher(minus).matches());
This returns:
true
false
true
I believe that your problem is due to the fact that - is not a word character. Thus, the word boundary will match after the -, and so will not capture it. Word boundaries match before the first and after the last word character in a string, as well as at any place where a word character sits next to a non-word character. Also note that a word boundary is a zero-width match.
One possible alternative is a pattern that matches any number starting with a space character and an optional dash, and ending at a word boundary. It will also match a number starting at the beginning of the string.
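The exact pattern from this answer is not preserved here, so the following is only a sketch of one pattern that fits the description above. The class name SignedIntegerFinder, the method findNumbers, and the pattern itself are assumptions for illustration: start of string or one whitespace character, an optional minus sign, digits, then \b.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SignedIntegerFinder {
    // Assumed pattern (not taken from the original answer):
    // (?:^|\s)  start of input or a single whitespace character
    // (-?\d+)   optional minus sign followed by digits, captured as group 1
    // \b        word boundary, so "12x" is rejected
    static final Pattern SIGNED_INT = Pattern.compile("(?:^|\\s)(-?\\d+)\\b");

    // Collects every signed integer found in the input, without the leading space.
    static List<String> findNumbers(String input) {
        List<String> numbers = new ArrayList<String>();
        Matcher m = SIGNED_INT.matcher(input);
        while (m.find()) {
            numbers.add(m.group(1));
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(findNumbers(" 12 "));     // [12]
        System.out.println(findNumbers(" -12 "));    // [-12]
        System.out.println(findNumbers("-3 4 -56")); // [-3, 4, -56]
    }
}
```

The minus sign sits between the leading whitespace and the digits, so no \b is ever required directly in front of it. The trailing \b still rejects input such as 12x.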
A word boundary \b matches where the character on one side is a word character and the character on the other side is a non-word character, so a regular expression for a negative number has to account for the minus sign being a non-word character.
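A word boundary can occur in one of three positions: before the first character in the string, if the first character is a word character; after the last character in the string, if the last character is a word character; or between two characters in the string, where one is a word character and the other is not.
Word characters are alphanumeric (plus the underscore); a minus sign is not. Taken from Regex Tutorial.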
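To see how those positions apply to the strings from the question, here is a small sketch that lists the indices at which \b matches. The class name WordBoundaryPositions and the helper boundaries are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordBoundaryPositions {
    // Returns every index in the input at which the zero-width \b matches.
    static List<Integer> boundaries(String input) {
        List<Integer> positions = new ArrayList<Integer>();
        Matcher m = Pattern.compile("\\b").matcher(input);
        while (m.find()) {
            positions.add(m.start());
        }
        return positions;
    }

    public static void main(String[] args) {
        System.out.println(boundaries("12"));    // [0, 2]
        System.out.println(boundaries(" 12 "));  // [1, 3]
        System.out.println(boundaries(" -12 ")); // [2, 4]
    }
}
```

In " -12 " the first boundary is at index 2, between - and 1. There is none at index 1, between the space and the -, because both are non-word characters. That is why \b followed directly by an optional minus sign cannot match " -12 ".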
I think it's the boundary (i.e. character following) of the last match or the beginning or end of the string.
While learning regular expressions, I was really stuck on the metacharacter \b. I could not grasp its meaning, no matter how many times I asked myself what it was. After some experiments on the website, I noticed the pink vertical dashes at the beginning and at the end of every word, and at that point its meaning became clear: it is exactly a word (\w) boundary. My answer is aimed purely at building intuition; for the logic behind it, see the other answers.
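A similar picture can be reproduced without any website by inserting a visible marker at every \b. This is only a sketch; the class name BoundaryMarker, the method mark, and the choice of | as the marker are arbitrary.

```java
class BoundaryMarker {
    // Inserts "|" at each position where the zero-width \b matches.
    static String mark(String input) {
        return input.replaceAll("\\b", "|");
    }

    public static void main(String[] args) {
        System.out.println(mark("hello world")); // |hello| |world|
        System.out.println(mark(" -12 "));       //  -|12|
    }
}
```

The markers wrap each run of word characters, and the minus sign in " -12 " ends up outside them.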