I'm trying to use the script command to record an interactive shell session so that I can use it to prepare documentation.
According to the man page:
Script places everything in the log file, including linefeeds and
backspaces. This is not what the naive user expects.
I am the naive user (don't usually get a shout out in man pages, this is rather exciting!), and I'd like to process the output so that backspaces, linefeeds and deleted characters and so on are removed.
For example, I run a script session:
stew:~> script -f scriptsession.log
Script started, file is scriptsession.log
stew:~> date
Mon Aug 22 15:00:37 EDT 2011
stew:~> #extra chars: that
stew:~> exit
exit
Script done, file is scriptsession.log
Then I use cat to read the session log:
stew:~> cat scriptsession.log
Script started on Mon 22 Aug 2011 03:00:35 PM EDT
stew:~> date
Mon Aug 22 15:00:37 EDT 2011
stew:~> #extra chars: that
stew:~> exit
exit
Script done on Mon 22 Aug 2011 03:01:01 PM EDT
But when I use less, I see evidence of the unwanted characters that are invisible using cat:
stew:~> less scriptsession.log
Script started on Mon 22 Aug 2011 03:00:35 PM EDT
stew:~> date
Mon Aug 22 15:00:37 EDT 2011
stew:~> #extra chars: thiESC[ESC[ESC[ESC[Kthat
stew:~> exit
exit
Script done on Mon 22 Aug 2011 03:01:01 PM EDT
scriptsession.log lines 1-8/8 (END)
I understand that cat doesn't remove the invisible characters, it just doesn't display them the way less does, so if I redirect the cat output to a file, the file still has the unwanted characters.
The output format I'd like is a copy of what cat displays. Thanks!
(apologies if this is a duplicate, searching "unix script output format" returns lots of noise results with respect to the question at hand!)
I solved the problem by running
scriptreplay
in a screen and then dumping the scrollback buffer to a file. The following expect script does this for you.
It has been tested for logfiles with up to 250,000 lines. In the working directory you need your scriptlog, a file called "time" containing 10,000,000 copies of the line "1 10", and the script. It needs the name of your scriptlog as a command line argument, like
./name_of_script name_of_scriptlog
The time file can be generated by
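One way to produce a file matching that description is sketched below. This is an assumption, not necessarily the command the author used; the helper name make_timing_file and its arguments are made up for illustration.

```shell
# Hypothetical helper: write COUNT copies of the line "1 10" to OUTFILE.
# awk is used instead of a shell loop so that large counts stay fast.
make_timing_file() {
    count=$1
    outfile=$2
    awk -v n="$count" 'BEGIN { for (i = 0; i < n; i++) print "1 10" }' > "$outfile"
}

# Usage for the case described above (creates a file named "time"):
#   make_timing_file 10000000 time
```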
As mentioned by Keith,
col
does part of the job (the control characters). You can further use
ansifilter
to remove any ANSI escape sequences that you don't want: http://www.andre-simon.de/zip/download.html#ansifilter
The script removes the interleaved 'ESC [ C' and 'ESC [ K' substrings, then replaces each 'c BS' substring with nothing, where c stands for any character.
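Those two steps could be sketched with sed roughly as follows. This is a simplified illustration, not the original script: the function name strip_script_noise is made up here, and it assumes the log only contains the 'ESC [ C' and 'ESC [ K' sequences plus backspaces and trailing carriage returns. Dropping 'ESC [ C' (cursor forward) instead of emulating it can lose a character position, so treat the result as approximate.

```shell
# Hypothetical filter: reads a script log on stdin, writes cleaned text to stdout.
strip_script_noise() {
    esc=$(printf '\033')   # literal ESC byte
    bs=$(printf '\010')    # literal backspace byte
    cr=$(printf '\r')      # literal carriage return byte
    # 1. delete ESC [ <digits> C and ESC [ <digits> K sequences
    # 2. repeatedly delete "any character followed by backspace"
    # 3. delete the carriage return that script records at end of line
    sed -e "s/${esc}\[[0-9;]*[CK]//g" \
        -e ":a" -e "s/[^${bs}]${bs}//" -e "ta" \
        -e "s/${cr}\$//"
}

# Usage:
#   strip_script_noise < scriptsession.log > scriptsession.clean.log
```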
Or you can use the "more" command, which will interpret those characters and display exactly what you typed, received as output, etc, as if you scrolled back in your buffer.
The
col
command will do some, but not all, of the filtering you're looking for. (It doesn't seem to recognize the control sequences for bold and underlining, for example.)
An approach I've used in the past is to (a) change my shell prompt so it doesn't do any highlighting (it normally does), and/or (b) set
$TERM
to "dumb"
so various commands won't try to use certain control sequences.
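As a rough sketch of both suggestions: the first helper records a session with a plain prompt and TERM set to "dumb", the second runs an existing log through col. The function names are made up for illustration, and an interactive shell may still override PS1 from its startup files, so the prompt setting is an assumption that may not hold on every system.

```shell
# Hypothetical helper: record a session to the given log file with a
# plain prompt and a "dumb" terminal type, so fewer control sequences
# end up in the log.
record_plain() {
    TERM=dumb PS1='$ ' script -f "$1"
}

# Hypothetical helper: print a log with backspaces resolved.
# col -b keeps only the last character written to each column position.
clean_with_col() {
    col -b < "$1"
}

# Usage:
#   record_plain scriptsession.log
#   clean_with_col scriptsession.log > scriptsession.clean.log
```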