Replacing “#”, “$”, “%”, “&”, and “_” with “\\#”,

Posted 2019-02-06 08:20

Question:

I have a plain-text document that I want to compile with LaTeX. However, it sometimes contains the characters "#", "$", "%", "&", and "_". To compile properly in LaTeX, I must first replace these characters with "\#", "\$", "\%", "\&", and "\_". I have used these lines in sed:

sed -i 's/\#/\\\#/g' ./file.txt
sed -i 's/\$/\\\$/g' ./file.txt
sed -i 's/\%/\\\%/g' ./file.txt
sed -i 's/\&/\\\&/g' ./file.txt
sed -i 's/\_/\\\_/g' ./file.txt

Is this correct?

Unfortunately, the file is too large to open in any GUI software, so checking if my sed line is correct with a text editor is difficult. I tried searching with grep, but the search does not work as expected (e.g. below, I searched for any lines containing "$"):

grep "\$" file.txt
  • What is the best way to put "\" in front of these characters?
  • How can I use grep to successfully check the lines with the replacements?

Answer 1:

You can do the replacement with a single call to sed:

sed -i -E 's/([#$%&_\])/\\&/g' file.txt

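The & in the replacement text stands for the entire match, which here is the single character matched by the bracket expression (so the parentheses are not strictly required). The pattern also includes \ itself, since \ is the LaTeX escape character and has to be escaped in the original file as well. Be aware, though, that \\ in LaTeX is a line break rather than a printed backslash, so a literal backslash may need \textbackslash instead.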
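
Before running anything with -i on the real file, it may help to try the substitution on a throwaway line. The sketch below is only an illustration: the sample text and the variable names are made up, it drops -i so nothing is modified, and it leaves the backslash out of the bracket expression to keep the example focused on the five characters from the question.

```shell
# Hypothetical sample line containing all five characters.
sample='50% of #1 & $5 for a_b'

# Inside a bracket expression, $ is an ordinary character.
# In the replacement, \\ is one literal backslash and & is the matched text.
escaped=$(printf '%s\n' "$sample" | sed 's/[#$%&_]/\\&/g')

printf '%s\n' "$escaped"
# prints: 50\% of \#1 \& \$5 for a\_b
```

Once the output looks right on a few such lines, the same expression can be applied to the file itself.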



Answer 2:

sed -i 's/\#/\\\#/g' ./file.txt
sed -i 's/\$/\\\$/g' ./file.txt
sed -i 's/\%/\\\%/g' ./file.txt
sed -i 's/\&/\\\&/g' ./file.txt
sed -i 's/\_/\\\_/g' ./file.txt

You don't need the \ in the search pattern for most of them, only for $ (it's a special character meaning the end of a line; the rest aren't special). And in the replacement, you only need two backslashes (\\), not three. Also, you could do it all in one call with several -e expressions (-i.bak keeps a backup copy of the original as file.txt.bak):

sed -i.bak -e 's/#/\\#/g'  \
           -e 's/\$/\\$/g' \
           -e 's/%/\\%/g'  \
           -e 's/&/\\&/g'  \
           -e 's/_/\\_/g' file.txt

You don't need to double-escape anything (except the \\) because these are single-quoted. In your grep, bash interprets the escape on the $ because $ is special to the shell (specifically, a sigil for variables), so grep receives and searches for just $, which in a regular expression means the end of a line. You need to either single-quote the pattern to prevent bash from interpreting the \ ('\$'), or add another pair of backslashes ("\\\$"). Presumably, that's where you're getting the \\\ from, but you don't need it in the sed as it's written.
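
To see the quoting difference described above, one option is to run grep against a few known lines and count the matches. This is a small sketch with invented sample data and variable names, not a check of the actual file.

```shell
# Three hypothetical lines: an unescaped $, an escaped \$, and no $ at all.
data=$(printf '%s\n' 'cost is $5' 'cost is \$5' 'no dollar here')

# Double quotes: bash turns \$ into $, so grep sees the end-of-line anchor,
# which matches every line.
n_anchor=$(printf '%s\n' "$data" | grep -c "\$")

# Single quotes: grep sees \$, a literal dollar sign.
n_dollar=$(printf '%s\n' "$data" | grep -c '\$')

# Single quotes again: \\ is a literal backslash and \$ a literal dollar,
# so only lines containing the two-character text \$ match.
n_escaped=$(printf '%s\n' "$data" | grep -c '\\\$')

printf '%s %s %s\n' "$n_anchor" "$n_dollar" "$n_escaped"
# prints: 3 2 1
```

The first count explains why the original grep seemed to match everything: the pattern that reached grep was just the end-of-line anchor.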



Answer 3:

I think your problem is that bash itself is handling those escapes.

  1. What you have looks right to me. But warning: it will also doubly escape e.g. a \# that is already escaped. If that's not what you want, you might want to modify your patterns to check that there isn't a preceding \ already.
  2. $ is special to bash: it introduces variable expansion and command substitution. I guess grep "\\$" file.txt should do what you expect.
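
One possible way to avoid the double escaping mentioned in point 1 is to first strip any backslash that already precedes one of the five characters, and then escape them all. The sketch below assumes that approach; escape_latex is a hypothetical helper name, and the sample text is invented. It does not try to handle a literal backslash that merely happens to sit before one of these characters.

```shell
# Hypothetical helper: reads stdin, writes the escaped text to stdout.
escape_latex() {
  # Step 1: turn an existing \# (or \$, \%, \&, \_) back into the bare character.
  # Step 2: put exactly one backslash in front of every such character.
  sed -e 's/\\\([#$%&_]\)/\1/g' -e 's/[#$%&_]/\\&/g'
}

# The input mixes an already escaped \# with an unescaped # and %.
once=$(printf '%s\n' 'a\#b #c 5%' | escape_latex)
twice=$(printf '%s\n' "$once" | escape_latex)

printf '%s\n' "$once"    # prints: a\#b \#c 5\%
printf '%s\n' "$twice"   # prints the same line again
```

Because the first step undoes earlier escapes, running the helper a second time leaves the text unchanged, which makes it safer to rerun on a file that was partly processed.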


Answer 4:

I won't address the sed part; the other answers are good enough ;-)

You can use less as a viewer to check your huge file (or more, but less is more comfortable than more).

For searching, you can use fgrep: it ignores regular expressions, so fgrep '\$' really searches for the text \$. fgrep is the same as invoking grep -F.

EDIT: fgrep '\$' and fgrep "\$" are different. In the second case, bash interprets the string and replaces it with a single character, $ (i.e. fgrep will search for $ only).
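
As a rough way to check the result without opening the file, fixed-string search can count the escaped forms, and a second search can list any special characters that still lack a backslash. The sketch below uses invented sample lines and variable names; on the real file, the same grep commands would take file.txt as an argument instead of reading a pipe.

```shell
# Hypothetical sample: line 2 still contains an unescaped $.
data=$(printf '%s\n' 'pay \$5' 'pay $5' '100\% sure')

# grep -F treats the pattern as plain text, so this counts lines containing \$.
fixed_hits=$(printf '%s\n' "$data" | grep -c -F '\$')

# Lines where one of the five characters is at the start of the line
# or is preceded by something other than a backslash, with line numbers.
leftover=$(printf '%s\n' "$data" | grep -n -E '(^|[^\\])[#$%&_]')

printf '%s\n' "$fixed_hits"   # prints: 1
printf '%s\n' "$leftover"     # prints: 2:pay $5
```

An empty result from the second search would suggest that every occurrence has been escaped.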