grep: matching on literal “+”

Posted 2019-07-21 10:57

Question:

I need to find occurrences of "(+)" in my SQL scripts (i.e., Oracle outer join expressions). Realizing that "+", "(", and ")" are all special regex characters, I tried:

grep "\(\+\)" *

Now this does return occurrences of "(+)", but other lines as well. (Seemingly anything with open and close parens on the same line.) Recalling that parens are only special for extended grep, I tried:

grep "(\+)" *
grep "(\\+)" *

Both of these returned only lines that contain "()". So assuming that "+" can't be escaped, I tried an old trick:

grep "([+])" *

That works. I cross-checked the result with a non-regex tool.

Question: Can someone explain what exactly is going on with the "+" character? Is there a less kludgy way to match on "(+)"?

(I am using the Cygwin grep command.)

EDIT: Thanks for the solutions. I now see that, per the GNU grep manual that Bruno referenced, "\+" in a basic expression gives "+" its extended meaning, so "(\+)" matches one or more "(" characters followed by a ")". In my files that is always "()".
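The behavior described in the EDIT can be reproduced with a small sketch. It assumes GNU grep (as shipped with Cygwin); the sample file and its three lines are made up for illustration, and the variable name repeat_hits is hypothetical.

```shell
# Hypothetical sample file; the contents are invented for this demo.
tmp=$(mktemp)
printf '%s\n' \
  'where a.id = b.id(+)' \
  'select count() from t' \
  'select f(a) from t' > "$tmp"

# Basic syntax: "(" and ")" are literal, but "\+" means
# "one or more of the preceding item", which here is the literal "(".
repeat_hits=$(grep '(\+)' "$tmp")

# Show which lines matched.
printf '%s\n' "$repeat_hits"

rm -f "$tmp"
```

Under these assumptions only the line containing "()" should be printed, because in "(+)" the open paren is followed by "+" rather than ")".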

Answer 1:

GNU grep (which is included in Cygwin) supports two syntaxes for regular expressions: basic and extended. grep uses basic regular expressions and egrep or grep -E uses extended regular expressions. The basic difference, from the grep manual, is the following:

In basic regular expressions the metacharacters ?, +, {, |, (, and ) lose their special meaning; instead use the backslashed versions \?, \+, \{, \|, \(, and \).

Since you want the ordinary meaning of the characters ( + ), either of the following two forms should work for your purpose:

grep "(+)" *       # Basic
egrep "\(\+\)" *   # Extended
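As a rough check of the two forms above, the sketch below counts matching lines in a made-up sample file. It also tries the bracket form from the question and grep -F, which treats the whole pattern as a fixed string and may be the least kludgy option. GNU grep is assumed, and the file contents and variable names are hypothetical.

```shell
# Hypothetical sample file; only the first line contains "(+)".
tmp=$(mktemp)
printf '%s\n' \
  'where a.id = b.id(+)' \
  'select count() from t' \
  'set x = y + 1' > "$tmp"

# Basic syntax: "(", "+", and ")" are ordinary characters.
basic_hits=$(grep -c '(+)' "$tmp")

# Extended syntax: each metacharacter is backslash-escaped.
extended_hits=$(grep -E -c '\(\+\)' "$tmp")

# Bracket expression: "+" inside [] is always literal.
bracket_hits=$(grep -c '([+])' "$tmp")

# Fixed-string mode: no regex interpretation at all.
fixed_hits=$(grep -F -c '(+)' "$tmp")

echo "basic=$basic_hits extended=$extended_hits bracket=$bracket_hits fixed=$fixed_hits"

rm -f "$tmp"
```

Single quotes are used so that the shell passes each pattern to grep unchanged.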


Answer 2:

You probably need to add some backslashes because the shell is swallowing them.

ETA: Actually, I just tried on my Cygwin and grep "(+)" seems to work just fine for what you want.
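One way to check what the shell actually hands to grep is to print the quoted patterns. This is a minimal sketch assuming a POSIX-style shell such as bash; the variable names are hypothetical. Inside double quotes a backslash is only removed before $, `, ", \, or a newline, so "\(" and "\+" should arrive intact, while "\\+" collapses to "\+".

```shell
# The three patterns from the question, quoted exactly as they were typed.
first="\(\+\)"
second="(\+)"
third="(\\+)"

# Print each pattern as grep would receive it.
printf '%s\n' "$first" "$second" "$third"
```

If the second and third patterns print identically, that would explain why both attempts in the question returned the same lines.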