Escape dollar sign in regexp for sed

Posted 2020-03-14 07:13

I will introduce what my question is about before actually asking - feel free to skip this section!

Some background info about my setup

To update files manually in a software system, I am creating a bash script to remove all files that are not present in the new version, using diff:

for i in $(diff -r old new 2>/dev/null | grep "Only in old" | cut -d "/" -f 3- | sed "s/: /\//g"); do echo "rm -f $i" >> REMOVEOLDFILES.sh; done

This works fine. However, my files often have a dollar sign ($) in the filename, apparently due to some permutations of the GWT framework. Here is one example line from the bash script created above:

rm -f var/lib/tomcat7/webapps/ROOT/WEB-INF/classes/ExampleFile$3$1$1$1$2$1$1.class

Executing this script would not remove the wanted files, because bash expands $3, $1 and so on as positional parameters (the script's arguments). Hence I have to escape the dollar signs with "\$".
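The expansion described above can be reproduced without touching any files. This is a minimal sketch; the variable names `expanded` and `literal` are made up for illustration, and it assumes the script runs without arguments.

```shell
# Clear the positional parameters, as in a script run without arguments.
set --

# Double quotes (or no quotes): $3, $1 and $2 expand to empty strings.
expanded="ExampleFile$3$1$1$1$2$1$1.class"

# Single quotes: the dollar signs stay literal.
literal='ExampleFile$3$1$1$1$2$1$1.class'

printf '%s\n' "$expanded"   # ExampleFile.class
printf '%s\n' "$literal"    # ExampleFile$3$1$1$1$2$1$1.class
```

So the generated `rm -f` line would target ExampleFile.class rather than the intended file.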

My actual question

I now want to add a sed command to the aforementioned pipeline that replaces this dollar sign. sed also treats the dollar sign as a special character in regular expressions, so obviously I have to escape it there as well. But somehow this doesn't work, and I could not find an explanation after googling a lot.

Here are some variations I have tried:

echo "Bla$bla" | sed "s/\$/2/g"        # Output: Bla2
echo "Bla$bla" | sed 's/$$/2/g'        # Output: Bla
echo "Bla$bla" | sed 's/\\$/2/g'       # Output: Bla
echo "Bla$bla" | sed 's/@"\$"/2/g'     # Output: Bla
echo "Bla$bla" | sed 's/\\\$/2/g'      # Output: Bla

The desired output in this example should be "Bla2bla". What am I missing? I am using GNU sed 4.2.2

EDIT

I just realized that the above example is wrong to begin with: inside double quotes, the shell already expands $bla as a variable before echo runs, so the following sed never sees the dollar sign. Here is a proper example:

  1. Create a textfile test with the content bla$bla
  2. cat test gives bla$bla
  3. cat test | sed "s/$/2/g" gives bla$bla2
  4. cat test | sed "s/\$/2/g" gives bla$bla2
  5. cat test | sed "s/\\$/2/g" gives bla2bla

Hence, the last version is the answer. Remember: when testing, first make sure your test is correct before you question the test object.
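The same test can be run without a temporary file. The sketch below feeds the string to sed with printf and compares single-quoted and double-quoted sed scripts; the variable names are only for illustration.

```shell
# Single quotes keep the dollar sign in the test input literal.
input='bla$bla'

# Single-quoted script: sed receives s/\$/2/g exactly as typed.
single=$(printf '%s\n' "$input" | sed 's/\$/2/g')

# Double-quoted script: the shell reduces \\ to \ first,
# so sed again receives s/\$/2/g.
double=$(printf '%s\n' "$input" | sed "s/\\$/2/g")

printf '%s\n' "$single"   # bla2bla
printf '%s\n' "$double"   # bla2bla
```

With single quotes, one backslash is enough because the shell does no processing of its own on the script.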

3 answers
一纸荒年 Trace。
#2 · 2020-03-14 07:59

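Inside a double-quoted shell string, the correct way to escape a dollar sign in a sed regular expression is a double backslash, because the shell reduces \\ to \ before sed sees the script (inside single quotes, a single backslash is enough). Then, to produce the escaped version in the output, we need some additional backslashes: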

cat filenames.txt | sed "s/\\$/\\\\$/g" > escaped-filenames.txt

Yep, that's four backslashes in a row. This creates the required changes: a filename like bla$1$2.class would then change to bla\$1\$2.class. This I can then insert into the full pipeline:

for i in $(diff -r old new 2>/dev/null | grep "Only in old" | cut -d "/" -f 3- | sed "s/: /\//g" | sed "s/\\$/\\\\$/g"); do echo "rm -f $i" >> REMOVEOLDFILES.sh; done
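The escaping step can be checked in isolation. This sketch uses the single-quoted equivalent of the sed script above and then lets the shell read the escaped name back with eval, which is safe here only because the input is a fixed test string; the variable names are illustrative.

```shell
name='bla$1$2.class'

# Single-quoted equivalent of "s/\\$/\\\\$/g": sed receives s/\$/\\$/g,
# so every $ is replaced by a backslash followed by $.
escaped=$(printf '%s\n' "$name" | sed 's/\$/\\$/g')
printf '%s\n' "$escaped"      # bla\$1\$2.class

# Let the shell parse the escaped form, as it would inside the generated script.
eval "roundtrip=$escaped"
printf '%s\n' "$roundtrip"    # bla$1$2.class
```

The round trip yields the original name, which is what the rm command in the generated script needs to see.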

Alternative to solve the background problem

chepner posted an alternative that solves the background problem by simply adding single quotes around the filenames in the output. This way, bash does not expand the $ signs as variables when executing the script, and the files are properly removed:

for i in $(diff -r old new 2>/dev/null | grep "Only in old" | cut -d "/" -f 3- | sed "s/: /\//g"); do echo "rm -f '$i'" >> REMOVEOLDFILES.sh; done

(note the changed echo "rm -f '$i'" in that line)

虎瘦雄心在
#3 · 2020-03-14 08:13

There is already a nice answer directly in the edited question that helped me a lot - thank you!

I just want to add a bit of curious behavior that I stumbled across: matching against a dollar sign at the end of lines (e.g. when modifying PS1 in your .bashrc file). As a workaround, I match for additional whitespace.

$ DOLLAR_TERMINATED="123456 $"
$ echo "${DOLLAR_TERMINATED}" | sed -e "s/ \\$/END/"
123456END
$ echo "${DOLLAR_TERMINATED}" | sed -e "s/ \\$$/END/"
sed: -e expression #1, char 13: Invalid back reference
$ echo "${DOLLAR_TERMINATED}" | sed -e "s/ \\$\s*$/END/"
123456END

Explanation to the above, line by line:

  • Defining DOLLAR_TERMINATED - I want to replace the dollar sign at the end of DOLLAR_TERMINATED with "END"
  • It works if I don't check for the line ending
  • It won't work if I match for the line ending as well (adding one more $ on the left side)
  • It works if I additionally match for (non-present) whitespace

(My sed version is 4.2.2 from February 2016, bash is version 4.3.48(1)-release (x86_64-pc-linux-gnu), in case that makes any difference)
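A likely explanation for the "Invalid back reference" error: inside double quotes, bash expands `$$` to the process ID of the shell, so sed receives a backslash followed by digits, which it reads as a back reference. The sketch below prints what sed would be handed and then shows the single-quoted form, where no shell expansion takes place. The variable names `seen_by_sed` and `result` are made up for illustration.

```shell
DOLLAR_TERMINATED='123456 $'

# What the double-quoted script turns into: \\ becomes \ and $$ becomes the PID.
seen_by_sed="s/ \\$$/END/"
printf '%s\n' "$seen_by_sed"     # e.g. s/ \4711/END/ (the number varies)

# Single quotes: \$ is a literal dollar sign, the final $ anchors the line end.
result=$(printf '%s\n' "$DOLLAR_TERMINATED" | sed -e 's/ \$$/END/')
printf '%s\n' "$result"          # 123456END
```

The whitespace workaround works in double quotes because `$\` and `$/` are not expanded by the shell, so the script reaches sed intact.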

等我变得足够好
#4 · 2020-03-14 08:17

There are other problems with your script, but file names containing $ are not a problem if you properly quote the argument to rm in the resulting script.

echo "rm -f '$i'" >> REMOVEOLDFILES.sh

or using printf, which makes quoting a little nicer and is more portable (note the \n, since printf does not add a newline on its own):

printf "rm -f '%s'\n" "$i" >> REMOVEOLDFILES.sh

(Note that I'm addressing the real problem, not necessarily the question you asked.)
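The answer does not say which other problems it has in mind. One plausible candidate is that `for i in $(...)` splits names on whitespace, and another is that a filename containing a single quote would break the `'$i'` quoting. A minimal sketch that avoids both, assuming one path per line on stdin; `emit_rm_commands` is a hypothetical helper name, not something from the thread:

```shell
# Hypothetical helper: reads one path per line from stdin and prints one
# rm command per path, single-quoted so the shell reads the name back literally.
emit_rm_commands() {
    while IFS= read -r path; do
        # Turn each ' into '\'' (close quote, escaped quote, reopen quote).
        # The shell reduces \\\\ to \\, and sed reduces that to one backslash.
        escaped=$(printf '%s' "$path" | sed "s/'/'\\\\''/g")
        printf "rm -f -- '%s'\n" "$escaped"
    done
}

# Possible use with the pipeline from the question:
# diff -r old new 2>/dev/null | grep "Only in old" | cut -d "/" -f 3- \
#     | sed "s/: /\//g" | emit_rm_commands > REMOVEOLDFILES.sh
```

Reading line by line keeps names with spaces intact. In bash specifically, `printf '%q'` could do the quoting instead of the sed call. Filenames containing newlines are still not handled by this sketch.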
