How can I safely validate an untrusted regex in Perl?

Posted 2019-06-27 19:08

This answer explains that to validate an arbitrary regular expression, one simply uses eval:

while (<>) {
    eval "qr/$_/;";
    print $@ ? "Not a valid regex: $@\n" : "That regex looks valid\n";
}

However, this strikes me as very unsafe, for what I hope are obvious reasons. Someone could input, say:

foo/; system('rm -rf /'); qr/

or whatever devious scheme they can devise.
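To make the risk concrete without running anything destructive, here is a minimal sketch that swaps the `system` call for a harmless assignment. The flag `$injected` is a hypothetical stand-in for whatever side effect an attacker would choose; it is not part of the original code.

```perl
use strict;
use warnings;

our $injected = 0;   # hypothetical flag standing in for a real side effect

# Attacker-controlled input: closes the qr//, runs a statement, reopens qr//.
my $input = 'foo/; $main::injected = 1; qr/';

# String eval: the input is pasted into Perl source text and compiled as code,
# so Perl sees:  qr/foo/; $main::injected = 1; qr//;
eval "qr/$input/;";

print $injected ? "payload ran\n" : "payload did not run\n";
```

The pattern itself is never the problem here; the payload runs because the input becomes part of the source that `eval` compiles.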

The natural way to prevent such things is to escape special characters, but if I escape too many characters, I severely limit the usefulness of the regex in the first place. A strong argument can be made, I believe, that at least []{}()/-,.*?^$! and white space characters ought to be permitted (and probably others), un-escaped, in a user regex interface, for the regexes to have minimal usefulness.

Is it possible to secure myself from regex injection, without limiting the usefulness of the regex language?

2 Answers
Ridiculous、
Reply #2 · 2019-06-27 19:46

There is some discussion about this over at The Monastery.

TLDR: use re::engine::RE2 (-strict => 1);

Make sure to add (-strict => 1) to your use statement; otherwise re::engine::RE2 will fall back to Perl's own regex engine.

The following is a quote from junyer, owner of the project on GitHub.

RE2 was designed and implemented with an explicit goal of being able to handle regular expressions from untrusted users without risk. One of its primary guarantees is that the match time is linear in the length of the input string. It was also written with production concerns in mind: the parser, the compiler and the execution engines limit their memory usage by working within a configurable budget – failing gracefully when exhausted – and they avoid stack overflow by eschewing recursion.

走好不送
Reply #3 · 2019-06-27 19:51

The solution is simply to change

eval("qr/$_/")

to

eval("qr/\$_/")

This can be written more clearly as follows:

eval('qr/$_/')

But that's still not optimal. The following would be far better as it doesn't involve generating and compiling Perl code at run-time:

eval { qr/$_/ }
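As a rough sketch of how the block form might be wrapped for reuse: `compile_untrusted` is a hypothetical helper name, and the sample patterns are made up for illustration. With a block eval the input is only ever interpolated into a pattern, never into Perl source, and Perl refuses `(?{ ... })` code blocks in run-time patterns unless `use re 'eval'` is in effect.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (compiled regex, undef) on success,
# or (undef, error message) if the pattern does not compile.
sub compile_untrusted {
    my ($pattern) = @_;
    my $re = eval { qr/$pattern/ };   # block eval: no Perl code is generated
    return ($re, undef) if defined $re;
    return (undef, $@);
}

# Made-up sample inputs, including the injection attempt and a code block.
for my $pattern ('fo+', '(unclosed', 'foo/; $main::x = 1; qr/', '(?{ 1 })') {
    my ($re, $err) = compile_untrusted($pattern);
    if (defined $re) {
        print "That regex looks valid: $pattern\n";
    } else {
        print "Not a valid regex: $err";   # $err already ends in a newline
    }
}
```

The injection string from the question compiles as an ordinary (if odd) pattern here, because its slashes and semicolons are just regex characters at that point.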

Note that neither solution protects you from denial of service attacks. It's quite easy to write a pattern that will take longer than the life of the universe to complete. To handle that situation, you could execute the regex match in a child process for which a CPU ulimit has been set.
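One way to sketch the child-process idea using only core modules is shown below. Note the substitution: instead of a CPU ulimit it uses `alarm` in the child, which is a wall-clock limit, not a CPU-time limit. The helper name `match_with_timeout` and its return convention are assumptions made for illustration.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: returns 1 (matched), 0 (no match), or undef
# (invalid pattern, or the child was killed before it finished).
sub match_with_timeout {
    my ($pattern, $text, $seconds) = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: no SIGALRM handler is installed, so the default action
        # (terminate the process) applies even in the middle of a match.
        alarm $seconds;
        my $re = eval { qr/$pattern/ };
        POSIX::_exit(2) unless defined $re;        # invalid pattern
        POSIX::_exit($text =~ $re ? 0 : 1);        # report result via exit code
    }

    waitpid($pid, 0);
    return undef if $? & 127;                      # child died from a signal
    my $code = $? >> 8;
    return $code == 0 ? 1 : $code == 1 ? 0 : undef;
}

my $result = match_with_timeout('fo+', 'foo', 5);
print defined $result ? "result: $result\n" : "no result (timeout or bad pattern)\n";
```

Forking per match is expensive, so this fits occasional user-supplied patterns better than hot loops. The child exits through `POSIX::_exit` so that it does not run the parent's `END` blocks or flush copies of its buffered output.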
