I'm tasked with making a shell script that swaps two strings and then outputs a file. The commands are similar to:
sed s/search_for/replace/g output.txt > temp.dat
mv temp.dat output.txt
The script works like this:
./myScript var_A var_B output.file
I got that to work fine. The second part does the same thing, but I must treat the following special characters as regular strings:
[ ] ^ * + . $ \ -
I have a general idea of how I want to tackle this (this may be the wrong way). I want to accept those characters and store them in variables with a \ prepended.
var_A=\\$1
var_B=\\$2
My issue is with the * (asterisk) and \ (backslash) characters. I'm using a simple test script to see what parameters I can easily convert to a variable:
for i in "$@"
do
    echo "$i"
done
But the * character shows all the files in the directory, and \ shows the next argument. I know about set -o noglob and set -f, but those will not work for me (and they don't work in the script). I also know that you can escape using a backslash, but I can't use that either. I must be able to take a special character (even * and /) and convert it to a string. I hope this all makes sense and someone can help me.
If I understand correctly, you put patterns in variables, then you use these variables in sed, and you need to treat the patterns as literal strings, without their special meaning in regular expressions?
If so, then before passing the patterns to sed, you need to escape the special symbols. Here's a possible implementation with my tests:
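#!/bin/sh
# Put a backslash in front of every character that is special in a sed
# basic regular expression, plus the / delimiter. The s command uses : as
# its own delimiter so that / can appear inside the bracket expression.
# (+ and - are not special in a basic regular expression, so they are left alone.)
escaped() {
    printf '%s\n' "$1" | sed -e 's:[][\\/.^$*]:\\&:g'
}
# Test patterns: the special characters from the question.
set -- [ ] ^ \* + . \$ \\ -
for pat1; do
    pat2=$(escaped "$pat1")
    printf '%s\n' "$pat1 was $pat1" | sed -e "s/$pat2/_/"
done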
The escaped function takes the argument and puts a backslash in front of special characters. The loop demonstrates that the pat2 variable generated this way correctly matches the special characters in the input string.
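To connect this back to the script in the question, here is a minimal sketch of a literal search-and-replace built on the same escaping idea. The function names escape_search, escape_replacement and replace_in_file are made up for this illustration. It assumes single-line search and replacement strings and a mktemp command, and it also escapes the replacement side, where \, & and / are special to sed.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; none of these names come from the question.

# Escape basic-regex metacharacters and the / delimiter in a search string.
escape_search() {
    printf '%s\n' "$1" | sed -e 's:[][\\/.^$*]:\\&:g'
}

# Escape the characters that are special in a sed replacement: \ & and /.
escape_replacement() {
    printf '%s\n' "$1" | sed -e 's:[\\/&]:\\&:g'
}

# usage: replace_in_file SEARCH REPLACEMENT FILE
# Replaces every literal occurrence of SEARCH in FILE with REPLACEMENT.
replace_in_file() {
    search=$(escape_search "$1")
    rep=$(escape_replacement "$2")
    tmp=$(mktemp "$3.XXXXXX") || return
    sed -e "s/$search/$rep/g" "$3" > "$tmp" && mv -- "$tmp" "$3"
}
```

It could be called as replace_in_file "$1" "$2" "$3" from a script like the one in the question, with the arguments quoted on the command line.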
If you want to perform literal replacements, sed is the wrong tool for the job.
See the awk script given in http://mywiki.wooledge.org/BashFAQ/021. Quoted here:
# usage: gsub_literal STR REP
# replaces all instances of STR with REP. reads from stdin and writes to stdout.
gsub_literal() {
  [[ $1 ]] || return
  awk -v str="${1//\\/\\\\}" -v rep="${2//\\/\\\\}" '
    BEGIN { len = length(str); }
    {
      out = "";
      while (i = index($0, str)) {
        out = out substr($0, 1, i-1) rep;
        $0 = substr($0, i + len);
      }
      out = out $0;
      print out;
    }
  '
}
...which can be used as...
tempfile=$(mktemp "$file.XXXXXX")
gsub_literal "$search" "$rep" \
<"$file" \
>"$tempfile" && \
mv -- "$tempfile" "$file"
with absolutely any values for $search and $rep.
Perl is also well-suited for operations of this type, having in-line replace functionality and (unlike sed) the ability to refer directly to its argv array or environment variables for literal search or replacement values.
You have to quote your patterns on the shell's command line. You can't work around that.
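A small illustration of that quoting point (show_args is a made-up helper that mirrors the test loop from the question): single quotes stop the shell from expanding * or interpreting \ before the script ever sees them.

```shell
#!/bin/sh
# show_args is a hypothetical helper: print each argument on its own line.
show_args() {
    for arg in "$@"; do
        printf '%s\n' "$arg"
    done
}

# Single quotes pass these characters through untouched.
show_args '*' '\' '['
```

Without the quotes, the shell would replace * with the file names in the current directory, and the backslash would escape whatever character follows it.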
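Perl regular expressions give you the \Q...\E escape (the in-pattern form of the quotemeta function), which treats every character as literal. Without it, the pattern is interpreted as a regular expression and fails: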
perl -e '
$str = q{this is a string with **emphasis**};
$pattern = q{**emphasis**};
$repl = "characters";
$str =~ s/$pattern/$repl/;
print $str
'
Quantifier follows nothing in regex; marked by <-- HERE in m/* <-- HERE *emphasis**/ at -e line 5.
but with \Q...\E around the pattern:
perl -e '
$str = q{this is a string with **emphasis**};
$pattern = q{**emphasis**};
$repl = "characters";
$str =~ s/\Q$pattern\E/$repl/;
#.........^^
print $str
'
this is a string with characters