Why does Regexp have a timeout method, while in th

Posted 2020-08-01 06:06

This is a theoretical Computer Science question (Computation Theory).

I know that RegExps can take a very long time to evaluate. However, from the Theory of Computation we know that matching against a Regular Expression can be done very fast: a DFA reads each input character once, so matching takes time linear in the length of the input.

If RegExps are equivalent to Finite Automata, why do RegExps have (or require) a timeout method? Using a DFA, the computation time for matching can be extremely fast.
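To make the DFA claim concrete, here is a minimal sketch of table-driven matching. The transition table is a hand-built, hypothetical DFA for the textbook pattern `(a|b)*abb`; the names `TRANSITIONS` and `dfa_accepts` are illustrative and not taken from any regex library.

```python
# Hypothetical hand-built DFA for the pattern (a|b)*abb.
# State k roughly means "the last characters read match the first k of 'abb'".
TRANSITIONS = {
    0: {"a": 1, "b": 0},
    1: {"a": 1, "b": 2},
    2: {"a": 1, "b": 3},
    3: {"a": 1, "b": 0},
}
START = 0
ACCEPTING = {3}


def dfa_accepts(text: str) -> bool:
    """Return True if the whole text is in the language of (a|b)*abb."""
    state = START
    for ch in text:
        # Exactly one table lookup per input character, no backtracking.
        next_state = TRANSITIONS[state].get(ch)
        if next_state is None:
            # Character outside the alphabet {a, b}: reject immediately.
            return False
        state = next_state
    return state in ACCEPTING
```

The loop does a constant amount of work per character, so the running time grows linearly with the input length regardless of the pattern that produced the table.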

By RegExps I mean the Regular Expression pattern-matching classes in major languages: JavaScript, C#, etc.

Are common RegExps ("regex"s) not equivalent to the Regular Expressions in Theory of Automata (i.e. Regular Languages)?

For examples, see "How do I timeout Regex operations to prevent hanging in .NET 4.5?" and "Regex Pattern Catastrophic backtracking".

If Regexp matching requires backtracking, it means they are not equivalent to Regular Expressions.

If the languages captured by "Regexp"s are not Regular Languages, historically why (out of which necessity) were they extended?

Is it that the resulting DFA would require a huge set of states?

3 answers
一夜七次
#2 · 2020-08-01 06:21

(out of which necessity) were they extended?

Regexp implementations were extended in systems in which the lack of a regexp feature requires difficult workarounds, such as writing a substantial amount of code in an inexpressive programming language. There is also the grave risk that the code might turn out to be correct, performant and robust against false positive matches.

祖国的老花朵
#3 · 2020-08-01 06:31

A good reason is catastrophic backtracking, which explains why matching of some regexes will not return before the heat death of the universe.
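The blow-up can be seen with a toy matcher. The sketch below hard-codes a naive backtracking strategy for the pattern `(a|a)*b` and counts recursive calls; the function name `backtrack_steps` is hypothetical, and real engines are far more elaborate, but the exponential shape of the search on a failing input is the same idea.

```python
def backtrack_steps(text: str) -> tuple[bool, int]:
    """Naively match the whole text against (a|a)*b.

    Returns (matched, number of recursive calls made).
    """
    steps = 0

    def match(i: int) -> bool:
        nonlocal steps
        steps += 1
        # Try each alternative of (a|a), then go around the star again.
        for alternative in ("a", "a"):
            if text.startswith(alternative, i) and match(i + len(alternative)):
                return True
        # Leave the star: the rest of the input must be exactly "b".
        return text[i:] == "b"

    matched = match(0)
    return matched, steps


if __name__ == "__main__":
    for n in (5, 10, 15):
        # No trailing "b", so every path fails and all of them get explored.
        print(n, backtrack_steps("a" * n))
```

On an input of n `a` characters with no final `b`, both alternatives are retried at every position, so the call count is 2^(n+1) - 1: it doubles with each extra character. That growth is why a timeout is the practical safety net for backtracking engines.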

#4 · 2020-08-01 06:36

Because regexes are not equivalent to the Regular Expressions of automata theory.

They are more like cousins, with extra functionality that makes them more complex and sometimes (depending on the regex) impractical to execute on long strings.
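One such extra feature is the backreference, which is picked here as an illustration; the answer above does not name specific features. A minimal sketch using Python's standard `re` module, with the hypothetical helper name `is_mirrored`:

```python
import re

# (a*) captures a run of a's; \1 must then repeat exactly the captured text.
# The pattern therefore describes strings of the form a^n b a^n.
PATTERN = re.compile(r"(a*)b\1")


def is_mirrored(text: str) -> bool:
    """Return True if text is n a's, one b, then the same number of a's."""
    return PATTERN.fullmatch(text) is not None
```

The language of strings a^n b a^n is not regular (a finite automaton cannot count an unbounded n), so no DFA can recognize it. A pattern language that can express it is strictly more powerful than textbook regular expressions, and the linear-time DFA guarantee no longer applies to it.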
