Over the years, "regex" pattern matching has been getting more and more powerful to the point where I wonder: is it really just context-sensitive-grammar matching? Is it a variation/extension of context-free-grammar matching? Where is it right now and why don't we just call it that instead of the old, restrictive "regular expression"?
There are features in modern regular expression implementations that break the rules of the classic regular expression definition.
For example, Microsoft's .NET balancing groups, (?<name1-name2> … ), make it possible to match the language L₀₁ = {ε, 01, 0011, 000111, … }. But this language is not regular according to the Pumping Lemma.
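To make the point about L₀₁ concrete, here is a minimal Python sketch. It is an illustration only, not the .NET balancing-group pattern: Python's built-in `re` module has no balancing groups, so the helper `in_l01` (a name made up for this example) compares the two run lengths in ordinary code. The classic regular expression `0*1*` is the closest purely regular approximation, and it cannot require the two runs to be equally long.

```python
import re

# Closest classic (truly regular) approximation of 0^n 1^n:
# any run of 0s followed by any run of 1s, with no way to compare lengths.
APPROX = re.compile(r"0*1*")


def in_l01(s: str) -> bool:
    """Recognize L01 = { 0^n 1^n : n >= 0 } by counting outside the regex."""
    m = re.fullmatch(r"(0*)(1*)", s)
    # The regex only checks the shape; the length comparison is the
    # non-regular part that a finite automaton cannot perform.
    return m is not None and len(m.group(1)) == len(m.group(2))


for s in ["", "01", "0011", "001", "0110"]:
    print(repr(s), bool(APPROX.fullmatch(s)), in_l01(s))
```

The string "001" is accepted by the approximation but rejected by `in_l01`, which is exactly the gap that extensions such as balancing groups close.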
The way I see it: I do know of regular expression parsers that allow you to match against something the parser has already encountered, achieving something like a context-sensitive grammar.
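As a small illustration of matching against something already encountered, the sketch below uses a backreference in Python's `re` module. The pattern `(.+)\1` accepts exactly the strings of the form ww (a non-empty string repeated twice), a language that is known not to be context-free over an alphabet of two or more symbols. The names `SQUARE` and `is_square` are made up for this example.

```python
import re

# (.+) captures a non-empty prefix w; \1 must then repeat exactly that text.
SQUARE = re.compile(r"(.+)\1")


def is_square(s: str) -> bool:
    """Return True if s has the form ww for some non-empty string w."""
    return SQUARE.fullmatch(s) is not None


for s in ["abab", "abcabc", "aa", "aba", ""]:
    print(repr(s), is_square(s))
```

Here "abab", "abcabc", and "aa" are accepted, while "aba" and the empty string are rejected. No finite automaton, and no pushdown automaton either, can make that distinction in general.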
Still, regular expression parsers, however sophisticated they may be, don't allow for recursive application of rules, which is a definite requirement for context-free grammars.
The term regex, in my opinion, mostly refers to the syntax used to express those regular grammars (the stars and question marks).
In particular, backreferences to capturing parentheses make regular expressions more complex than regular, context-free, or context-sensitive grammars. The name has simply grown historically, as many words have. See also this section in Wikipedia and this explanation with an example from Perl.