Reading Chomsky hierarchy ... ... I know regexp can't parse type-2 grammars (context-free grammars), and also type-1 and type-0. Can regular expressions parse/catch ALL type-3 grammars (regular grammars)?
问题:
回答1:
Yes, provided they support alternation, concatenation, and the Kleene star. This is the case for regexes of the PCRE (Perl/Java/JavaScript/PHP/...) type: alternation is implemented by ((...)|(...))
, concatenation by (...)(...)
, and the Kleene star by (...)*
. (There are a few other details — in most of these languages you need to use something like \A
and \z
to indicate "start-of-string" and "end-of-string", which in a regular grammar is taken for granted — but that's the idea.)
But not everything called a "regular expression" in a programming context necessarily has all of the above; for example, POSIX Basic Regular Expressions supports only a very limited form of alternation, where all "branches" of the alternation consist of a single character (e.g., whereas PCREs has both (a|b|c)
and the special-case-equivalent [abc]
, POSIX BREs only have [abc]
, so can't express something like (ab|c)
).