Consider the following example:
my $text = "some_strange_thing";
$text =~ s/some_(\w+)_thing/no_$1_stuff/;
print "Result: $text\n";
It prints
"Result: no_strange_stuff"
So far so good.
Now, I need to get both the match and replacement patterns from external sources (user input, config file, etc.).
The naive solution looks like this:
my $match = "some_(\\w+)_thing";
my $repl = "no_\$1_stuff";
my $text = "some_strange_thing";
$text =~ s/$match/$repl/;
print "Result: $text\n";
However, this prints:
"Result: no_$1_stuff"
What's wrong? How can I get the same outcome with externally supplied patterns?
Solution 1: String::Substitution
Use the String::Substitution package:
use String::Substitution qw(gsub_modify);
my $find = 'some_(\w+)_thing';
my $repl = 'no_$1_stuff';
my $text = "some_strange_thing";
gsub_modify($text, $find, $repl);
print $text,"\n";
The replacement string only interpolates (term used loosely) numbered match vars (like $1 or ${12}). See "interpolate_match_vars" for more information.
This module does not save or interpolate $& to avoid the "considerable performance penalty" (see perlvar).
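To make the idea of "interpolating only numbered match vars" concrete, here is a minimal sketch in plain Perl that involves no eval at all. It is not how String::Substitution is actually implemented; expand_template is a hypothetical helper name I made up for illustration, and the sketch assumes the pattern contains at least one capture group and that only the first match is replaced.

```perl
use strict;
use warnings;

# Hypothetical helper: replace $1, $2, ${12}, ... in a template with the
# corresponding captured strings. Anything else in the template is left alone.
sub expand_template {
    my ($template, @captures) = @_;    # $captures[0] corresponds to $1
    $template =~ s/\$(?:\{(\d+)\}|(\d+))/
        my $n = defined $1 ? $1 : $2;
        ($n >= 1 && defined $captures[$n - 1]) ? $captures[$n - 1] : ''
    /ge;
    return $template;
}

my $find = 'some_(\w+)_thing';
my $repl = 'no_$1_stuff';
my $text = 'some_strange_thing';

my $re = qr/$find/;
if (my @captures = $text =~ $re) {
    # Save the match offsets before any other regex runs and overwrites them.
    my ($start, $end) = ($-[0], $+[0]);
    substr($text, $start, $end - $start, expand_template($repl, @captures));
}
print "Result: $text\n";
```

Because the replacement is treated as data rather than code, a hostile $repl can at worst produce odd text. For real use the module is likely the better choice, since it handles global replacement and edge cases this sketch ignores.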
Solution 2: Data::Munge
This is a solution mentioned by Grinnz in the comments below.
The Data::Munge module can be used as follows:
use Data::Munge;
my $find = qr/some_(\w+)_thing/;
my $repl = 'no_$1_stuff';
my $text = 'some_strange_thing';
my $flags = 'g';
print replace($text, $find, $repl, $flags);
# => no_strange_stuff
Solution 3: A quick'n'dirty way (if replacement won't contain double quotes and security is not considered)
DISCLAIMER: I provide this solution as this approach can be found online, but its caveats are not explained. Do not use it in production.
With this approach, the replacement string can't include a " (double quotation mark). And since this is equivalent to handing whoever writes the configuration file direct code access, it should not be exposed to Web users (as mentioned by Daniel Martin).
You can use the following code:
#!/usr/bin/perl
my $match = qr"some_(\w+)_thing";
my $repl = '"no_$1_stuff"';
my $text = "some_strange_thing";
$text =~ s/$match/$repl/ee;
print "Result: $text\n";
See IDEONE demo
Result:
Result: no_strange_stuff
You have to
- Declare the replacement in '"..."' so that $1 can be evaluated later
- Use /ee to force the double evaluation of the variables in the replacement.
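To see what each evaluation step contributes, the small comparison below runs the same substitution with no modifier, with /e, and with /ee. The variable names $plain, $once and $twice are just for illustration.

```perl
use strict;
use warnings;

my $match = qr/some_(\w+)_thing/;
my $repl  = '"no_$1_stuff"';    # single quotes: $1 stays literal for now

# No modifier: $repl is interpolated once, and its contents are not rescanned.
my $plain = "some_strange_thing";
$plain =~ s/$match/$repl/;

# /e: the replacement is the Perl expression $repl, whose value is the same string.
my $once = "some_strange_thing";
$once =~ s/$match/$repl/e;

# /ee: that string is then eval'ed as Perl code, so "no_$1_stuff" interpolates $1.
my $twice = "some_strange_thing";
$twice =~ s/$match/$repl/ee;

print "plain: $plain\n";    # plain: "no_$1_stuff"
print "once:  $once\n";     # once:  "no_$1_stuff"
print "twice: $twice\n";    # twice: no_strange_stuff
```

Only the second eval ever looks inside the string, which is also why the naive version in the question prints a literal $1.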
A modifier available specifically to search and replace is the s///e evaluation modifier. s///e treats the replacement text as Perl code, rather than a double-quoted string. The value that the code returns is substituted for the matched substring. s///e is useful if you need to do a bit of computation in the process of replacing text.
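As a small illustration of "a bit of computation" with a single /e, the example below doubles every number in a string. The sample text is made up.

```perl
use strict;
use warnings;

my $text = "3 apples and 10 pears";

# The replacement side is Perl code: $1 * 2 is computed for each match.
(my $doubled = $text) =~ s/(\d+)/$1 * 2/ge;

print "$doubled\n";    # 6 apples and 20 pears
```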
You can use qr to build the pattern as a precompiled regex (qr"some_(\w+)_thing").
Essentially the same approach as the accepted solution, but I kept the initial lines the same as the problem statement, since I thought that might make it easier to fit into more situations:
my $match = "some_(\\w+)_thing";
my $repl = "no_\$1_stuff";
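# Compile the pattern once so it is not recompiled on every substitution.
my $qrmatch = qr($match);
# Escape every double quote and backslash, then wrap the result in double
# quotes so that the second eval of /ee sees a double-quoted string literal.
my $code = $repl;
$code =~ s/([^"\\]*)(["\\])/$1\\$2/g;
$code = qq["$code"];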
my $text = "some_strange_thing";
$text =~ s/$qrmatch/$code/ee;
print "Result: $text\n";
Note that this works no matter what is in $repl, whereas the naive solution has issues if $repl contains a double quote character itself, or ends in a backslash.
Also, assuming that you're going to run the three lines at the end (or something like it) in a loop, do make sure that you don't skip the qr line. It will make a huge performance difference if you skip the qr and just use s/$match/$code/ee.
Also, even though it's not as trivial to get arbitrary code execution with this solution as it is with the accepted one, it wouldn't surprise me if it's still possible. In general, I'd avoid solutions based on s///ee if the $match or $repl come from untrusted users. (e.g., don't build a web service out of this)
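To back up that suspicion, here is a harmless sketch showing that escaping quotes and backslashes is not enough: a double-quoted string still interpolates @{[ ... ]}, so code inside it runs during the second eval. The hostile $repl and the $side_effect variable are made up for the demonstration.

```perl
use strict;
use warnings;

our $side_effect = 0;

my $match = "some_(\\w+)_thing";
# Hypothetical hostile replacement: no quotes or backslashes, so the
# escaping step below leaves it untouched.
my $repl  = 'x@{[ $main::side_effect = 1 ]}y';

my $qrmatch = qr($match);
my $code = $repl;
$code =~ s/([^"\\]*)(["\\])/$1\\$2/g;
$code = qq["$code"];

my $text = "some_strange_thing";
$text =~ s/$qrmatch/$code/ee;

print "Result: $text\n";               # Result: x1y
print "side effect: $side_effect\n";   # side effect: 1
```

Here the injected code only sets a variable, but it could just as well be any other Perl expression.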
Doing this kind of replacement securely when $match and $repl are supplied by untrusted users should be asked as a different question if your use case includes that.