Regular expression for contents of parentheses in Racket

Posted 2019-07-30 17:04

How can I get the contents of parentheses in Racket? The contents may contain more parentheses. I tried:

(regexp-match #rx"((.*))" "(check)")

But the output has "(check)" three times rather than once:

'("(check)" "(check)" "(check)")

And I want only "check" and not "(check)".

Edit: for nested parentheses, the inner block should be returned. Hence (a (1 2) c) should return "a (1 2) c".

1 answer
骚年 ilove
#2 · 2019-07-30 17:39

In a regular expression, unescaped parentheses create capture groups; they do not match literal parenthesis characters. So #rx"((.*))" captures everything twice. Thus:

(regexp-match #rx"((.*))" "any text")
; ==> ("any text" "any text" "any text")

The first element of the resulting list is the whole match, the second is the outer capture group, and the third is the group nested inside it. If you want to match literal parentheses, you need to escape them:

(regexp-match #rx"\\((.*)\\)" "any text")
; ==> #f
(regexp-match #rx"\\((.*)\\)" "(a (1 2) c)")
; ==> ("(a (1 2) c)" "a (1 2) c")

Now the first element is the whole match. It can differ from the input, since a match may start at any position in the search string and, because .* is greedy, it extends as far as possible. The second element is the single capture group.
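To get only the contents, as the question asks, you can take the second element of the match list. A minimal sketch, assuming the escaped pattern above; the helper name paren-contents is hypothetical and not part of Racket:

```racket
#lang racket/base

;; Hypothetical helper: returns the text between the first "(" and the
;; last ")" in str, or #f when str contains no such pair.
(define (paren-contents str)
  (define m (regexp-match #rx"\\((.*)\\)" str))
  ;; m is #f on failure, otherwise (list whole-match capture).
  (and m (cadr m)))

(paren-contents "(check)")      ; ==> "check"
(paren-contents "(a (1 2) c)")  ; ==> "a (1 2) c"
(paren-contents "any text")     ; ==> #f
```

The and guards against calling cadr on #f when nothing matches.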

This fails if the string contains additional top-level sets of parentheses, e.g.:

(regexp-match #rx"\\((.*)\\)" "(1 2 3) (a (1 2) c)")
; ==> ("(1 2 3) (a (1 2) c)" "1 2 3) (a (1 2) c")

That is because the expression isn't nesting-aware. To handle nesting you need recursive regular expressions, like those in Perl with the (?R) syntax and friends, but Racket doesn't have this (yet?).
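Without recursive patterns, one workaround is to skip regular expressions and scan the string while tracking nesting depth. The following is only a sketch under stated assumptions: the name top-level-groups is hypothetical, every parenthesis character counts (no quoting or escaping), and a group that is never closed is silently dropped.

```racket
#lang racket/base

;; Hypothetical helper: returns the contents of each top-level
;; parenthesized group in str, in order of appearance.
(define (top-level-groups str)
  (let loop ([i 0] [depth 0] [start #f] [acc '()])
    (cond
      [(= i (string-length str)) (reverse acc)]
      [else
       (define c (string-ref str i))
       (cond
         [(char=? c #\()
          ;; When entering depth 1, remember where the contents begin.
          (loop (add1 i) (add1 depth) (if (zero? depth) (add1 i) start) acc)]
         [(and (char=? c #\)) (= depth 1))
          ;; Closing an outermost group: collect its contents.
          (loop (add1 i) 0 #f (cons (substring str start i) acc))]
         [(and (char=? c #\)) (> depth 1))
          ;; Closing a nested group: just step back out one level.
          (loop (add1 i) (sub1 depth) start acc)]
         [else
          ;; Ordinary character, or an unmatched ")" at depth 0 (ignored).
          (loop (add1 i) depth start acc)])])))

(top-level-groups "(1 2 3) (a (1 2) c)")
; ==> '("1 2 3" "a (1 2) c")
```

Because the depth counter distinguishes the closing parenthesis of an inner group from that of the outer one, each top-level group is returned separately instead of being merged by a greedy match.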
