I know I can search for something matching a space with the POSIX bracket expression [[:space:]]. Can I do a search for something not matching a space using a POSIX bracket expression? In particular, the characters it should match include letters and parentheses: ( and ).
[[:graph:]] looks somewhat vague:
[[:graph:]] - Non-blank character (excludes spaces, control characters, and similar)
You are confusing two things here: a bracket expression and a POSIX character class. The outer [...] is a bracket expression, and it can be negated with a ^ that immediately follows the opening [. A POSIX character class is a [: + name + :] construct that only works inside bracket expressions.
So, in your case, the [[:space:]] pattern is a bracket expression containing just one POSIX character class, which matches whitespace:
[ - opens the bracket expression
[:space:] - a POSIX character class for whitespace
] - closes the bracket expression.
To negate it, just add the ^ as in the usual NFA character classes: [^[:space:]].
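A minimal Ruby sketch of the negation described above. The sample string is an assumption, chosen so that it contains letters, parentheses, and two kinds of whitespace:

```ruby
# Hypothetical sample text with letters, parentheses, a space and a tab.
text = "foo (bar)\tbaz"

# [[:space:]]  : bracket expression holding one POSIX character class
# [^[:space:]] : the same bracket expression, negated by the leading ^
whitespace_runs     = text.scan(/[[:space:]]+/)
non_whitespace_runs = text.scan(/[^[:space:]]+/)

p whitespace_runs      # => [" ", "\t"]
p non_whitespace_runs  # => ["foo", "(bar)", "baz"]
```

The parentheses end up in the non-whitespace runs because the negated expression accepts every character that is not whitespace, not only letters.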
Note I deliberately differentiate the terms "bracket expression", "POSIX character class" and "character class" since POSIX and common NFA regex worlds adhere to different terminology.
Well, if
'foo bar'[ /[[:space:]]/ ] # => " "
matches a space, why wouldn't this work?
'foo bar'[ /[^[[:space:]]]/ ] # => "f"
For instance, something like this:
'foo bar'.scan(/[^[[:space:]]]+/) # => ["foo", "bar"]
It's important to remember that [[:space:]] is a character class, just as \s or \d or their negated versions are. Since \S is akin to [^\s], we can use [^[[:space:]]].
I think that should be [^[:space:]] since [:space:] is what expands inside the set notation [...].
I use the [[...]] form because that's what is documented in Regexp.
For clarity, here are some examples not using the double-brackets as shown in the documentation, but instead following the comments below:
'foo bar'[ /[[:space:]]/ ] # => " "
'foo bar'[ /[^[:space:]]/ ] # => "f"
'foo bar'[ /[^[[:space:]]]/ ] # => "f"
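Both negated spellings give the same result because Ruby's regex engine lets one bracket expression nest inside another, so [^[[:space:]]] reads as "anything not in the inner set [[:space:]]". A small sketch comparing the two on a made-up sample string (the string and variable names are illustrative only):

```ruby
# Hypothetical sample mixing spaces, a newline, letters and parentheses.
sample = "foo (bar)\n baz"

flat   = /[^[:space:]]+/    # ^ negates a set holding the POSIX class directly
nested = /[^[[:space:]]]+/  # ^ negates a set holding a nested bracket expression

flat_tokens   = sample.scan(flat)
nested_tokens = sample.scan(nested)

p flat_tokens                   # => ["foo", "(bar)", "baz"]
p nested_tokens                 # => ["foo", "(bar)", "baz"]
p flat_tokens == nested_tokens  # => true
```

The flat form is the one that carries over to other POSIX-style tools; the nested form relies on Ruby accepting nested bracket expressions.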
Note that this doesn't work:
'foo bar'[ /[:space:]/ ] # => "a"
/[:space:]/ is being interpreted by the regex engine as:
/[:space]/
which is a regular character-set, not a meta form. That's why it matches 'a' in "foo bar".
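To make the difference visible, here is a short sketch that runs the bare form, the equivalent plain set, and the proper POSIX form against one sample string. The sample string is an assumption, and Ruby may print a "duplicated range" warning for the bare form because the colon appears twice in the set:

```ruby
# Hypothetical sample containing a colon and letters from the word "space".
text = "foo bar: space"

bare_form   = Regexp.new("[:space:]")  # no outer brackets: just a set of characters
plain_set   = /[:space]/               # the same set written without the repeated colon
posix_class = /[[:space:]]/            # POSIX class inside a bracket expression

bare_hits  = text.scan(bare_form)
plain_hits = text.scan(plain_set)
posix_hits = text.scan(posix_class)

p bare_hits   # => ["a", ":", "s", "p", "a", "c", "e"]
p plain_hits  # => ["a", ":", "s", "p", "a", "c", "e"]
p posix_hits  # => [" ", " "]
```

Only the last pattern matches whitespace; the first two match the literal characters :, s, p, a, c and e.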
It seems that this variation also does the trick:
/[[:^alpha:]]+/.match("ab12")
Results in:
#<MatchData "12">
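Applied to the original question, the same caret-inside-the-class spelling can be compared with the caret-outside form. This is only a sketch on a made-up sample string; as far as I know, [:^name:] is an extension of Ruby's regex engine rather than part of POSIX, so other tools may reject it:

```ruby
# Hypothetical sample with letters, parentheses and spaces.
text = "foo (bar) baz"

caret_outside = text.scan(/[^[:space:]]+/)  # negate the whole bracket expression
caret_inside  = text.scan(/[[:^space:]]+/)  # negate the POSIX class itself

p caret_outside  # => ["foo", "(bar)", "baz"]
p caret_inside   # => ["foo", "(bar)", "baz"]

# The example from the text, using String#[] instead of Regexp#match.
non_alpha = "ab12"[/[[:^alpha:]]+/]
p non_alpha      # => "12"
```

The caret-inside form is handy when a single bracket expression has to combine a negated class with other positive members, for example /[[:^alpha:]_]/.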