
Question:

Edit: tchrist has informed me that my original accusations about Perl's insecurity are unfounded. However, the question still stands.

~~I know that in Perl, you can embed arbitrary code in a regular expression, so obviously accepting a user-supplied regex and matching it allows arbitrary code execution and is a clear security hole.~~ But is this true for all languages that use regular expressions? Is it true for all languages that use "Perl-compatible" regular expressions? In which languages are user-supplied regexes safe to use, and in which languages do they allow arbitrary code execution or other security holes?

Answer 1:

In most languages, allowing users to supply regular expressions means that you open yourself up to a denial-of-service attack.

Some regular expressions are extremely CPU-intensive to execute, so in general it's a bad idea to allow users to enter regular expressions that will be executed on a remote system.

For more info, read this page: http://www.regular-expressions.info/catastrophic.html
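To make the cost concrete, here is a minimal sketch in Python (chosen only for illustration; the answer itself is language-agnostic). The helper name `time_failed_match` and the pattern `(a+)+$` are my own assumptions, not something from the linked page. On a backtracking engine such as Python's `re`, each extra `a` in the input roughly doubles the time needed to discover that no match exists.

```python
import re
import time


def time_failed_match(n: int) -> float:
    """Return the seconds spent failing to match (a+)+$ against n 'a's plus a 'b'."""
    pattern = re.compile(r"(a+)+$")  # nested quantifiers: the classic trouble spot
    text = "a" * n + "b"             # the trailing 'b' guarantees the match fails
    start = time.perf_counter()
    result = pattern.match(text)
    elapsed = time.perf_counter() - start
    assert result is None
    return elapsed


if __name__ == "__main__":
    # Kept deliberately small; a few dozen characters is enough to stall the engine.
    for n in (12, 14, 16, 18):
        print(n, f"{time_failed_match(n):.4f}s")
```

The exact timings depend on the machine and the interpreter version, so none are quoted here; the point is the growth rate, not the absolute numbers.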

Answer 2:

This is not true: you cannot execute code callbacks in Perl by sneaking them in an evaluated regex. This is forbidden. You have to specifically override that with a lexically scoped

use re "eval";

if you expect to have both interpolation and code escapes happening in the same pattern.

Watch:

% perl -le '$x = "(?{ die 'naughty' })"; "aaa" =~ /$x/'
Eval-group not allowed at runtime, use re 'eval' in regex m/(?{ die naughty })/ at -e line 1.
Exit 255

% perl -Mre=eval -le '$x = "(?{ die 'naughty' })"; "aaa" =~ /$x/'
naughty at (re_eval 1) line 1.
Exit 255

Answer 3:

It's generally dynamic languages with an eval facility that tend to have the ability to execute code from regular expressions. In static languages (i.e. those requiring a separate compilation step) there is generally no way to execute code that wasn't compiled, so evaluating code from within a regex is impossible.

Without a way to embed code in a regex, the worst a user can do is write a regex that takes a long time to evaluate.
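If slow evaluation is the only remaining risk, one common mitigation is to put a time budget on the match. The following is a sketch under stated assumptions, not a hardened implementation: it uses Python, runs the match in a child interpreter via the standard `subprocess` module, and kills the child when the budget is exceeded. The names `search_with_timeout` and `_CHILD` and the exit-code convention are hypothetical.

```python
import subprocess
import sys

# Child program: argv[1] is the pattern, argv[2] the text.
# Exit codes (an assumption of this sketch): 0 = match, 1 = no match, 2 = bad pattern.
_CHILD = """
import re, sys
try:
    found = re.search(sys.argv[1], sys.argv[2]) is not None
except re.error:
    sys.exit(2)
sys.exit(0 if found else 1)
"""


def search_with_timeout(pattern: str, text: str, timeout: float = 1.0):
    """Return True/False for match/no match, or None if the time budget ran out."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", _CHILD, pattern, text],
            timeout=timeout,               # the child is killed when this expires
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return None
    if proc.returncode == 2:
        raise ValueError("invalid regular expression")
    return proc.returncode == 0


if __name__ == "__main__":
    print(search_with_timeout("a+", "xaay", timeout=5.0))
    print(search_with_timeout("(a+)+$", "a" * 40 + "b", timeout=1.0))
```

Starting a process per match is expensive, so this only suits low-volume use. It bounds CPU time per request but does nothing about memory use or about many requests arriving at once.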

Answer 4:

1) Vulnerabilities are found in regex libraries themselves, such as a buffer overflow that affected WebKit and allowed an attacker to gain remote code execution by reaching the regex library from JavaScript.

2) It is a DoS condition in C#.

3) User-supplied regexes can be dangerous in PHP because of modifiers. Adding the /e modifier makes preg_replace eval the replacement string as PHP code. In this case system() will be eval()'ed.

preg_replace("/.*/e","system('echo /etc/passwd')");

Or in the form of a vulnerability:

preg_replace($_GET['regex'],$_GET['check']);

Answer 5:

Regular expressions are a programming language. I don't think they're quite Turing-complete, but they're close enough that allowing your users to enter them into your web site IS allowing other people to run code on your server. QED, yes, it's a security hole.

You might be able to get away with allowing a subset of whatever regexp language you want to use, whitelisting a particular set of constructs to make it a not-big-enough-to-sweat-over hole. Other people have already mentioned the possible dooms of nesting and *. How much you're willing to let people load down your server is up to you. Personally, I'd be comfortable with letting 'em have one SQL "CONTAINS" statement and maybe a "BETWEEN()". :)
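As one possible reading of the "allow a subset" idea, here is a Python sketch that accepts only literal text plus a `*` wildcard and escapes everything else, so no user character is ever interpreted as regex syntax. The function name `compile_wildcard` and the 64-character limit are assumptions made for illustration.

```python
import re


def compile_wildcard(user_pattern: str, max_length: int = 64) -> "re.Pattern[str]":
    """Compile a pattern in which '*' is the only special character."""
    if len(user_pattern) > max_length:
        raise ValueError("pattern too long")
    # Collapse runs of '*' so the generated regex has no adjacent '.*' pairs.
    collapsed = re.sub(r"\*+", "*", user_pattern)
    # Escape every literal piece; only the joins become regex syntax.
    pieces = [re.escape(piece) for piece in collapsed.split("*")]
    return re.compile(".*".join(pieces), re.DOTALL)


if __name__ == "__main__":
    rx = compile_wildcard("foo*bar")
    print(bool(rx.search("xx foo yy bar zz")))
    # Regex metacharacters from the user are matched literally.
    print(bool(compile_wildcard("(a+)+$").search("aaaa")))
```

Even this subset is not free: several wildcards in one pattern can still cause polynomial backtracking, which is why the sketch also caps the pattern length.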

Answer 6:

I suspect Ruby would allow /#{system("rm -rf really_important_directory")}/ - is that the kind of thing you're worried about?

Answer 7:

AFAIK, you can do it safely in C#: you can supply the regex string to the Regex constructor, and if it fails to parse it'll throw. I'm not sure about others.
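The same "compile it and see whether it throws" approach can be sketched in Python rather than C# (an analogue chosen for illustration, not the C# API). The helper name `try_compile` is hypothetical. Note that this check only covers syntax: a pattern can compile cleanly and still be very slow to match.

```python
import re


def try_compile(user_pattern: str):
    """Return a compiled pattern, or None if the pattern is rejected by the parser."""
    try:
        return re.compile(user_pattern)
    except (re.error, OverflowError, RecursionError):
        # re.error covers syntax errors; the other two can arise from
        # oversized repeat counts or extremely deep nesting.
        return None


if __name__ == "__main__":
    print(try_compile("(unclosed") is None)    # malformed: rejected
    print(try_compile("(a+)+$") is not None)   # well-formed, yet costly to run
```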