Handling metacharacters in search strings

Posted 2019-08-29 06:59

Question:

I have user input that is used in a search string and may contain metacharacters,

for example C# or C++.

My grep command, inside a function, was:

grep -E "$1|$2" test.txt

With the values substituted and escaped by hand:

grep -E "C\+\+|testWord" test.txt
grep -E "C\#|testWord" test.txt

The first matched the expected lines, but the second did not; strangely, the # was completely ignored. Without the manual substitution, both catch anything with c followed by testWord, instead of c++ and c# respectively.

I've tried handling it using sed

$temp = `echo $1 | sed 's/[\#\!\&\;\`\"\'\|\*\?\~\<\>\^\(\)\[\]\{\}\$\+\\]/\\&/g'`

but it doesn't work correctly. Is there another way to handle user input that contains metacharacters?

Thanks in advance

Answer 1:

If you are passing the input as arguments to the script:

#!/bin/bash

input1="$1"
input2="$2"
while read -r line
do
    case "$line" in
        # Quoted expansions are matched literally; only the bare * is a wildcard.
        *"$input1"*|*"$input2"* ) echo "found: $line";;
    esac
done < "BooksDB.txt"

Output:

$ cat BooksDB.txt
this is  a line
this line has C++ and C#
this line has only C++ and that's it
this line has only C# and that's it
this is end line Caa

$ ./shell.sh C++ C#
found: this line has C++ and C#
found: this line has only C++ and that's it
found: this line has only C# and that's it

If you are getting the input from read:

# -r keeps backslashes in the typed input intact
read -r -p "Enter input1:" input1
read -r -p "Enter input2:" input2
while read -r line
do
    case "$line" in
        # Both alternatives need a wildcard on each side to match anywhere in the line.
        *"$input1"*|*"$input2"* ) echo "found: $line";;
    esac
done <"BooksDB.txt"
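
One detail worth noting about the case approach: an unquoted variable inside a case pattern is itself treated as a glob, so input such as a*b or [x] would act as a wildcard rather than as literal text. The following is a minimal sketch, not the answerer's exact script, that wraps the loop in a function and quotes the expansions. The name find_literal and its argument order are made up for illustration.

```shell
# find_literal (hypothetical helper): usage: find_literal FILE STRING1 STRING2
# Prints "found: LINE" for every line of FILE containing either string literally.
find_literal() {
    local file="$1" input1="$2" input2="$3" line
    while IFS= read -r line; do
        case "$line" in
            # Quoting the expansions disables glob matching for their contents;
            # the unquoted * on each side still means "anything".
            *"$input1"*|*"$input2"*) printf 'found: %s\n' "$line" ;;
        esac
    done < "$file"
}
```

It would be called as find_literal BooksDB.txt 'C++' 'C#'. Note that an empty string matches every line, so the caller may want to reject empty input first.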


Answer 2:

This works for me:

$ testfun1(){ echo "foo $1" | grep "$1"; }
$ testfun1 C#
foo C#
$ testfun2(){ read a; echo "bar $a" | grep "$a"; }
$ testfun2
C#
bar C#

Edit:

You might try this form without -E. In a basic regular expression + is an ordinary character, and GNU grep accepts \| for alternation:

$ testfun3(){ grep "$1\|$2" test.txt; }
$ testfun3 C++ awk
something about C++
blah awk blah
$ testfun3 C# sed
blah sed blah
the text containing C#
$ testfun3 C# C++
something about C++
the text containing C#
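
If the inputs are always meant as literal text, a related option (a different technique from the one shown above) is grep's -F flag, which turns off regular-expression interpretation entirely, combined with one -e per search string. This is only a sketch; search_literal and its argument order are hypothetical.

```shell
# search_literal (hypothetical helper): usage: search_literal STRING1 STRING2 FILE
# Prints the lines of FILE that contain either string, with no regex
# interpretation at all.
search_literal() {
    # -F: treat each pattern as a fixed string
    # -e: supply a pattern explicitly (also safe if it starts with a dash)
    grep -F -e "$1" -e "$2" -- "$3"
}
```

For example, search_literal 'C++' 'C#' test.txt. Both -F and -e are part of POSIX grep, so this does not rely on the GNU-specific \| syntax.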


Answer 3:

Just quote all the grep metacharacters in $1 and $2 before adding them to your grep expression.

Something like this:

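# Backslash-escape every extended-regex metacharacter in each argument.
quoted1=$(printf '%s\n' "$1" | sed -e 's/[][\.|$(){}?+*^]/\\&/g')
quoted2=$(printf '%s\n' "$2" | sed -e 's/[][\.|$(){}?+*^]/\\&/g')
# With -E, a bare | is alternation; "\|" would match a literal pipe instead.
grep -E "$quoted1|$quoted2" test.txt

ought to work. Adjust the metacharacter list to suit. Handling | is a little tricky: with -E the | that joins the two terms must stay bare, while any | inside the user input is backslashed by sed and so matches literally. Without -E the roles are reversed in GNU grep, where \| means alternation and a bare | is literal.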
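
The same idea can be packaged so that it handles any number of search terms. This is a rough sketch under the assumption that every term should match literally; ere_quote and search_any are hypothetical names, and the character list is the set of POSIX extended-regex metacharacters.

```shell
# ere_quote (hypothetical helper): backslash-escape the characters that are
# special in a POSIX extended regular expression.
ere_quote() {
    printf '%s\n' "$1" | sed 's/[][\.|$(){}?+*^]/\\&/g'
}

# search_any (hypothetical helper): usage: search_any FILE TERM...
# Prints the lines of FILE that contain at least one TERM literally.
search_any() {
    # Require a file and at least one term; an empty pattern would match every line.
    [ "$#" -ge 2 ] || return 2
    local file="$1" pattern="" term
    shift
    for term in "$@"; do
        # Join the escaped terms with a bare |, the ERE alternation operator.
        pattern="${pattern:+$pattern|}$(ere_quote "$term")"
    done
    grep -E -e "$pattern" -- "$file"
}
```

For example, search_any test.txt 'C++' 'C#' builds the pattern C\+\+|C# before handing it to grep -E.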