Can I do a negation of a POSIX bracket expression?

Posted 2020-04-12 00:24

I know I can search for something matching a space with the POSIX bracket expression [[:space:]]. Can I search for something not matching a space using a POSIX bracket expression? In particular, the characters it should match include letters and parentheses, such as (.

[[:graph:]] looks somewhat vague:

[[:graph:]] - Non-blank character (excludes spaces, control characters, and similar)

Tags: regex posix
3 answers
ゆ 、 Hurt°
#2 · 2020-04-12 00:31

It seems that this variation also does the trick:

/[[:^alpha:]]+/.match("ab12")

Results in:

#<MatchData "12">
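
Applied to the question, the same negated form can be used with the space class. This is a minimal sketch, assuming Ruby's regex engine: the [[:^name:]] syntax is an extension rather than part of POSIX itself, so other engines may reject it. The sample string is made up for illustration.

```ruby
# Hypothetical sample input, chosen to include letters and parentheses.
text = "foo (bar) baz"

# [[:^space:]] matches any single character that is NOT whitespace,
# so the + collects maximal runs of non-whitespace characters.
tokens = text.scan(/[[:^space:]]+/)

p tokens # => ["foo", "(bar)", "baz"]
```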
Summer. ? 凉城
#3 · 2020-04-12 00:33

You are conflating two things here: a bracket expression and a POSIX character class. The outer [...] is a bracket expression, and it can be negated with a ^ placed immediately after the opening [. A POSIX character class is a [:name:] construct (that is, [: + name + :]) that only works inside a bracket expression.

So, in your case, the [[:space:]] pattern is a bracket expression containing a single POSIX character class that matches whitespace:

  • [ - opening a bracket expression
    • [:space:] - a POSIX character class for whitespace
  • ] - closing bracket of the bracket expression.

To negate it, add a ^ right after the opening [, just as in the character classes of common NFA regex engines: [^[:space:]].
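
The negated bracket expression can be tried out in Ruby, the language the other answers in this thread use. This is only a sketch: the input string and variable names are hypothetical. The last pattern shows that a bracket expression can hold more than the POSIX class, here also excluding parentheses.

```ruby
# Hypothetical sample input containing letters, parentheses and spaces.
text = "call (foo) now"

# [^[:space:]] is a bracket expression, negated with ^, that contains
# the POSIX character class [:space:].
first = text[/[^[:space:]]/]        # first non-whitespace character
runs  = text.scan(/[^[:space:]]+/)  # maximal runs of non-whitespace

# Other items can sit next to the class inside the same brackets:
# this one matches anything that is neither whitespace nor a parenthesis.
words = text.scan(/[^[:space:]()]+/)

p first # => "c"
p runs  # => ["call", "(foo)", "now"]
p words # => ["call", "foo", "now"]
```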

Note that I deliberately distinguish between the terms "bracket expression", "POSIX character class" and "character class", since the POSIX and common NFA regex worlds use different terminology.

一夜七次
#4 · 2020-04-12 00:47

Well, if

'foo bar'[ /[[:space:]]/ ] # => " "

matches a space, why wouldn't this work?

'foo bar'[ /[^[[:space:]]]/ ] # => "f"

For instance, something like this:

'foo bar'.scan(/[^[[:space:]]]+/) # => ["foo", "bar"]

It's important to remember that [[:space:]] is a character class, just as \s or \d or their negated versions are. Since \S is akin to [^\s], we can use [^[[:space:]]].


From the comments: "I think that should be [^[:space:]] since [:space:] is what expands inside the set notation [...]."

I use the [[...]] form because that's what is documented in Regexp.

For clarity, here are some examples not using the double-brackets as shown in the documentation, but instead following the comments below:

'foo bar'[ /[[:space:]]/    ] # => " "
'foo bar'[ /[^[:space:]]/   ] # => "f"
'foo bar'[ /[^[[:space:]]]/ ] # => "f"

Note that this doesn't work:

'foo bar'[ /[:space:]/ ] # => "a"

/[:space:]/ is being interpreted by the regex engine as:

/[:space]/

which is an ordinary character set made of the literal characters :, s, p, a, c and e, not a meta form. That's why it matches the 'a' in "foo bar".
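
A few more probes make the difference visible. This is a sketch with made-up variable names. Ruby may print a "character class has duplicated range" warning for the first pattern, because ':' appears in it twice.

```ruby
# Without the outer brackets, [:space:] is just a set of the literal
# characters ':', 's', 'p', 'a', 'c' and 'e'.
plain_set = /[:space:]/
posix     = /[[:space:]]/

in_set   = "foo bar"[plain_set] # first character of the string that is in the set
in_posix = "foo bar"[posix]     # first whitespace character
colon    = ":"[plain_set]       # ':' is a member of the plain set
space    = " "[plain_set]       # a space is not, so there is no match

p in_set   # => "a"
p in_posix # => " "
p colon    # => ":"
p space    # => nil
```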
