I'm writing a Python program that logs terminal interaction (similar to the script program), and I'd like to filter out the VT100 escape sequences before writing to disk. I'd like to use a function like this:
import re

def strip_escapes(buf):
    escape_regex = re.compile(???)  # <--- this is what I'm looking for
    return escape_regex.sub('', buf)
What should go in escape_regex?
I found the following solution, which parses VT100 color codes and removes the non-printable escape sequences. The code snippet found here removed all codes for me when running a telnet session using telnetlib:
VT100 codes are already grouped (mostly) according to similar patterns here:
http://ascii-table.com/ansi-escape-sequences-vt-100.php
I think the simplest approach would be to use a tool like RegexBuddy to define one regex for each group of VT100 codes.
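As a rough sketch of that per-group idea, one pattern per group can be written by hand and the groups joined with `|`. The three groups and the names `GROUP_PATTERNS` and `strip_by_group` are illustrative assumptions, not a complete VT100 table:

```python
import re

# Hypothetical, incomplete set of groups; extend it from the table linked above.
GROUP_PATTERNS = {
    "cursor_move": r"\x1b\[\d*[ABCD]",   # ESC [ n A/B/C/D (up/down/right/left)
    "graphics":    r"\x1b\[[\d;]*m",     # ESC [ n ; n ... m (colors, bold, reset)
    "erase":       r"\x1b\[[0-2]?[JK]",  # ESC [ n J / K (erase screen or line)
}

# Join the per-group patterns into one alternation.
GROUP_REGEX = re.compile("|".join(GROUP_PATTERNS.values()))


def strip_by_group(buf):
    """Remove every sequence that matches one of the groups above."""
    return GROUP_REGEX.sub("", buf)
```

Any sequence that falls outside the listed groups is left in the text, so this approach is only as thorough as the table it is built from.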
The combined expression for escape sequences can be something generic like this:
It should be used with re.I.
This incorporates:
\x1b followed by a character in the range @ through _.
\x9b as opposed to \x1b + "[".
However, this will not work for sequences that define key mappings or otherwise include strings wrapped in quotes.
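For illustration, here is a minimal sketch of a pattern built from the description above. It is an assumption on my part rather than the exact expression from the answer: it spells out the parameter, intermediate, and final byte ranges explicitly instead of relying on re.I, and it works on str input (a bytes buffer would need a bytes pattern).

```python
import re

# Assumed pattern, reconstructed from the prose description only.
ESCAPE_REGEX = re.compile(
    r"""
    (?:\x1b\[|\x9b)   # CSI introducer: ESC [ or the single byte 0x9b
    [\x30-\x3f]*      # parameter bytes: digits, ';', '?', ...
    [\x20-\x2f]*      # intermediate bytes
    [\x40-\x7e]       # final byte, '@' through '~'
    |
    \x1b[\x40-\x5f]   # two-character sequence: ESC followed by '@' through '_'
    """,
    re.VERBOSE,
)


def strip_escapes(buf):
    """Return buf with the escape sequences matched above removed."""
    return ESCAPE_REGEX.sub("", buf)
```

With this sketch, `strip_escapes("\x1b[31mred\x1b[0m")` gives `"red"`. The same caveat applies: sequences that carry a quoted string, such as key-mapping definitions, are not handled.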