We want to remove ^[ and all of the escape sequences.
sed is not working and is giving us this error:
$ sed 's/^[//g' oldfile > newfile; mv newfile oldfile;
sed: -e expression #1, char 7: unterminated `s' command
$ sed -i '' -e 's/^[//g' somefile
sed: -e expression #1, char 7: unterminated `s' command
Just a note: let's say you have a file like this (such line endings are generated by git remote reports):
In binary, it looks like this:
It is visible that git here adds the sequence 0x1b 0x5b 0x4b before the line ending (0x0a).
Note that while you can match 0x1b with the literal format \x1b in sed, you CANNOT do the same for 0x5b, which represents the left square bracket [:
You might think you can escape the representation with an extra backslash \, which ends up as \\x5b; but while that "passes", it doesn't match anything as intended:
So if you want to match this character, apparently you must write it as an escaped left square bracket, that is \[; the rest of the values can then be entered with the escaped \x notation:
I stumbled upon this post when looking for a way to strip extra formatting from man pages. ansifilter did it, but the result was far from what I wanted (for example, all previously bold characters were duplicated, like
SSYYNNOOPPSSIISS).
For that task the correct command would be col -bx, for example (source):
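As a rough sketch of what col -bx does with man-style overstriking (a character, a backspace, then the same character again for bold), assuming a col implementation such as the one from util-linux is installed. The function name strip_overstrike and the file name ls.txt are only illustrations, not part of the original answer:

```shell
# Bold text in man output is often encoded as "X<backspace>X".
# col -b drops the backspaces and keeps the last character written to each
# column; -x makes col output spaces instead of tabs.
strip_overstrike() { col -bx; }   # hypothetical helper name

# A tiny overstruck sample: "S\bS", "Y\bY", "N\bN".
printf 'S\bSY\bYN\bN\n' | strip_overstrike

# Typical use, saving a man page as plain text (ls.txt is a made-up name):
# man ls | strip_overstrike > ls.txt
```

With overstruck input like the sample, each doubled character should collapse back to a single one.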
Are you looking for ansifilter?
Two things you can do: enter the literal escape (in bash):
Using keyboard entry:
alternatively
Or you can use character escapes:
or for all control characters:
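A minimal sketch of the character-escape approach, assuming GNU sed (which understands \xHH in regular expressions). The helper names are hypothetical, and the POSIX class [[:cntrl:]] is swapped in here as one way to cover all control characters; it is not necessarily the expression the answer originally used:

```shell
# Assumes GNU sed; \x1b is the ESC byte (0x1b).
strip_esc()   { sed 's/\x1b//g'; }          # remove only the ESC byte
strip_cntrl() { sed 's/[[:cntrl:]]//g'; }   # remove every control character, tabs included

# Only ESC is removed, so the "[K" that followed it stays behind.
printf 'a\033[Kb\n' | strip_esc

# ESC and the tab are both control characters and are both removed.
printf 'a\033\tb\n' | strip_cntrl
```

Note that removing just the ESC byte leaves the rest of each sequence (such as "[K") in the text, which is why a fuller pattern is usually needed for ANSI sequences.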
You can remove all non-printable characters with this:
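The command from this answer is not reproduced here, so the following is only a sketch of one common way to do it with tr, under the assumption that the input is plain ASCII. The helper name strip_nonprint is hypothetical:

```shell
# -c complements the set and -d deletes, so everything that is NOT a
# printable character or a newline is removed.
# LC_ALL=C makes tr work byte by byte; this also strips non-ASCII bytes
# (for example UTF-8 text), so it is an assumption that the input is ASCII.
strip_nonprint() { LC_ALL=C tr -cd '[:print:]\n'; }

printf 'a\033[31mred\033[0m\n' | strip_nonprint
```

This deletes the ESC byte itself but keeps the printable remainder of each sequence ("[31m", "[0m"), so it is not a complete ANSI stripper on its own.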
commandlinefu gives the correct answer which strips ANSI colours as well as movement commands:
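The exact commandlinefu command is not shown here. As a hedged sketch of the same idea, assuming GNU sed, the pattern below matches ESC, a literal [ (written as \[, in line with the earlier observation about the bracket), optional digits and semicolons, and one final letter. The helper name strip_ansi is hypothetical:

```shell
# Assumes GNU sed for the \x1b escape.
# \x1b        the ESC byte
# \[          a literal left square bracket
# [0-9;]*     optional numeric parameters separated by semicolons
# [A-Za-z]    the final letter (m for colours, K for erase-line, H for cursor moves, ...)
strip_ansi() { sed 's/\x1b\[[0-9;]*[A-Za-z]//g'; }

printf '\033[1;31mred\033[0m text\033[K\n' | strip_ansi
```

This covers colour codes and the ESC [ K sequence that git emits, but not every escape sequence: private-mode sequences such as ESC [ ? 25 l, for instance, contain a character outside this pattern.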
I don't have enough reputation to add a comment to the answer given by Luke H, but I did want to share the regular expression that I've been using to eliminate all of the ASCII Escape Sequences.