Perl regex substitution using external parameters

Posted 2020-03-02 18:35

Consider the following example:

my $text = "some_strange_thing";
$text =~ s/some_(\w+)_thing/no_$1_stuff/;
print "Result: $text\n";  

It prints

"Result: no_strange_stuff"

So far so good.

Now I need to get both the match pattern and the replacement pattern from external sources (user input, a config file, etc.). The naive solution looks like this:

my $match = "some_(\\w+)_thing";
my $repl = "no_\$1_stuff";

my $text = "some_strange_thing";
$text =~ s/$match/$repl/;
print "Result: $text\n";  

However, this prints:

"Result: no_$1_stuff"

What's wrong? How can I get the same outcome with externally supplied patterns?

2 Answers
地球回转人心会变
Reply #2 · 2020-03-02 19:13

Solution 1: String::Substitution

Use the String::Substitution module:

use String::Substitution qw(gsub_modify);

my $find = 'some_(\w+)_thing';
my $repl = 'no_$1_stuff';
my $text = "some_strange_thing";
gsub_modify($text, $find, $repl);
print $text,"\n";

The replacement string only interpolates (term used loosely) numbered match vars (like $1 or ${12}). See "interpolate_match_vars" for more information.
This module does not save or interpolate $& to avoid the "considerable performance penalty" (see perlvar).
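For illustration only, the idea of expanding nothing but numbered match variables can be sketched with core Perl. This is not how String::Substitution is implemented; replace_with_template is a hypothetical helper name, and the sketch replaces only the first match. It assumes the capture offsets in @- and @+ are enough to recover each group, and it never evaluates the template as code.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of String::Substitution): replace the first
# match of $re in $text, expanding only $N and ${N} in $template.
sub replace_with_template {
    my ($text, $re, $template) = @_;
    return $text unless $text =~ $re;

    # Remember where the whole match starts and ends.
    my ($start, $end) = ($-[0], $+[0]);

    # Snapshot the capture groups: $caps[N] is what $N matched
    # (empty string for groups that did not take part in the match).
    my @caps;
    for my $n (0 .. $#+) {
        push @caps, defined $-[$n] ? substr($text, $-[$n], $+[$n] - $-[$n]) : '';
    }

    # Expand $N and ${N}; everything else in the template stays literal.
    (my $out = $template) =~ s!\$(?:(\d+)|\{(\d+)\})!$caps[ $1 // $2 ] // ''!ge;

    # Splice the expanded template over the matched region.
    substr($text, $start, $end - $start, $out);
    return $text;
}

print replace_with_template('some_strange_thing', 'some_(\w+)_thing', 'no_$1_stuff'), "\n";
```

Because the template is only scanned for $N and ${N}, text such as @{[ ... ]} in the replacement is copied through unchanged rather than executed.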

Solution 2: Data::Munge

This is a solution mentioned by Grinnz in the comments below.

Data::Munge can be used as follows:

use Data::Munge;

my $find = qr/some_(\w+)_thing/;
my $repl = 'no_$1_stuff';
my $text = 'some_strange_thing';
my $flags = 'g';
print replace($text, $find, $repl, $flags);
# => no_strange_stuff

Solution 3: A quick'n'dirty way (if replacement won't contain double quotes and security is not considered)

DISCLAIMER: I provide this solution because the approach can be found online, usually without its caveats being explained. Do not use it in production.

With this approach, the replacement string cannot contain a double quotation mark ("). It is also equivalent to giving whoever writes the configuration file direct code access, so it should not be exposed to Web users (as mentioned by Daniel Martin).

You can use the following code:

#!/usr/bin/perl
my $match = qr"some_(\w+)_thing";
my $repl = '"no_$1_stuff"';
my $text = "some_strange_thing";
$text =~ s/$match/$repl/ee;
print "Result: $text\n";


Result:

Result: no_strange_stuff

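You have to

  1. Wrap the replacement in '"..."' (a double-quoted string inside single quotes) so that $1 is left uninterpolated until the substitution runs.
  2. Use /ee so that the replacement is evaluated twice: the first evaluation yields the string "no_$1_stuff" (quotes included), and the second evaluates that string as Perl code, which interpolates $1.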

A modifier available specifically to search and replace is the s///e evaluation modifier. s///e treats the replacement text as Perl code, rather than a double-quoted string. The value that the code returns is substituted for the matched substring. s///e is useful if you need to do a bit of computation in the process of replacing text.
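As a small sketch of the difference, the same substitution can be run with no modifier, with /e, and with /ee. The variable names $none, $e and $ee are only for this illustration.

```perl
use strict;
use warnings;

my $repl = '"no_$1_stuff"';   # double quotes are part of the string

my ($none, $e, $ee) = ("some_strange_thing") x 3;

# No modifier: $repl is interpolated once, its contents are inserted as-is.
$none =~ s/some_(\w+)_thing/$repl/;

# /e: the replacement is the Perl expression $repl, whose value is the same string.
$e =~ s/some_(\w+)_thing/$repl/e;

# /ee: the value of $repl is evaluated again as Perl code, so $1 is interpolated.
$ee =~ s/some_(\w+)_thing/$repl/ee;

print "$_\n" for $none, $e, $ee;
```

Only the /ee form produces no_strange_stuff; the other two leave "no_$1_stuff" in place, quotes included.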

You can use qr to precompile the pattern (qr"some_(\w+)_thing").

Summer. ? 凉城
Reply #3 · 2020-03-02 19:27

Essentially the same approach as the accepted solution, but I kept the initial lines the same as the problem statement, since I thought that might make it easier to fit into more situations:

my $match = "some_(\\w+)_thing";
my $repl = "no_\$1_stuff";

my $qrmatch = qr($match);
my $code = $repl;

# Backslash-escape every double quote and backslash in the replacement
$code =~ s/([^"\\]*)(["\\])/$1\\$2/g;
# Wrap it in double quotes so that /ee evaluates it as a string literal
$code = qq["$code"];

my $text = "some_strange_thing";
$text =~ s/$qrmatch/$code/ee;
print "Result: $text\n";

Note that this works even if $repl contains a double quote character or ends in a backslash, both of which break the naive solution. It does not escape @ or $, so those characters are still interpolated when the string is evaluated.
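A minimal sketch of those two cases, with the escaping step wrapped in a hypothetical helper named quote_for_ee (the name and the sample replacement strings are assumptions made for this example):

```perl
use strict;
use warnings;

# Hypothetical helper: escape " and \ and wrap the result in double quotes,
# so the string can be handed to s///ee as a string literal.
sub quote_for_ee {
    my ($repl) = @_;
    (my $code = $repl) =~ s/(["\\])/\\$1/g;
    return qq["$code"];
}

my $match = qr/some_(\w+)_thing/;

# Replacement containing double quotes.
my $with_quotes = "some_strange_thing";
my $code1 = quote_for_ee('say "$1"');
$with_quotes =~ s/$match/$code1/ee;
print "$with_quotes\n";

# Replacement ending in a single backslash.
my $with_backslash = "some_strange_thing";
my $code2 = quote_for_ee('$1\\');
$with_backslash =~ s/$match/$code2/ee;
print "$with_backslash\n";
```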

Also, assuming that you're going to run the three lines at the end (or something like it) in a loop, do make sure that you don't skip the qr line. It will make a huge performance difference if you skip the qr and just use s/$match/$code/ee.

Also, even though it's not as trivial to get arbitrary code execution with this solution as it is with the accepted one, it wouldn't surprise me if it's still possible. In general, I'd avoid solutions based on s///ee if the $match or $repl come from untrusted users. (e.g., don't build a web service out of this)
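To make that concern concrete, here is a harmless sketch of a replacement string that runs a Perl expression despite the quote and backslash escaping. The input is made up for the example; it relies only on the fact that @{[ ... ]} is evaluated inside a double-quoted string.

```perl
use strict;
use warnings;

my $match = qr/some_(\w+)_thing/;

# Hypothetical untrusted input: no quotes or backslashes, so the escaping
# below leaves it untouched, yet @{[ ... ]} still runs code under /ee.
my $repl = 'no_@{[ uc $1 ]}_stuff';

(my $code = $repl) =~ s/(["\\])/\\$1/g;
$code = qq["$code"];

my $text = "some_strange_thing";
$text =~ s/$match/$code/ee;
print "Result: $text\n";
```

Here the injected expression only calls uc, but any other expression could take its place.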

Doing this kind of replacement securely when $match and $repl are supplied by untrusted users should be asked as a different question if your use case includes that.
