I wrote a linux bash script with tee and grep to log and timestamp the actions I take in my various ssh sessions. It works, but the logged lines are mixed together sometimes and are full of control characters. How can I properly escape control and other characters not visible in the original sessions and log each line separately?
I am learning bash and the linux interface, so any other suggestions to improve the script would be extremely welcome!
Here is my script (used as a wrapper for the ssh command):
#! /bin/bash
logfile=~/logs/ssh.log
desc="sshlog ${@}"
tab="\t"
format_line() {
while IFS= read -r line; do
echo -e "$(date +"%Y-%m-%d %H:%M:%S %z")${tab}${desc}${tab}${line}"
done
}
echo "[START]" | format_line >> ${logfile}
# grep is used to filter out command line output while keeping commands
ssh "$@" | tee >(grep -e '\@.*\:.*\$' --color=never --line-buffered | format_line >> ${logfile})
echo "[END]" | format_line >> ${logfile}
And here is a screenshot of the garbled output in the log file:
A note on the solution: Tiago's answer took care of the nonprinting characters very well. Unfortunately, I just realized that the jumbling is being caused by backspaces and using the up and down keys for command completion. That is, the characters are being piped to grep as soon as they appear, and not line-by-line. I will have to ask about this in another question.
Update: I figured out a way to (almost always) handle up/down completion, backspace completion, and control characters.
You can remove those characters with:
perl -lpe 's/[^[:print:]]//g'
Not filtered:
perl -e 'for($i=0; $i<=255; $i++){print chr($i);}' | cat -A
^@^A^B^C^D^E^F^G^H^I$
^K^L^M^N^O^P^Q^R^S^T^U^V^W^X^Y^Z^[^\^]^^^_ !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~^?M-^@M-^AM-^BM-^CM-^DM-^EM-^FM-^GM-^HM-^IM-^JM-^KM-^LM-^MM-^NM-^OM-^PM-^QM-^RM-^SM-^TM-^UM-^VM-^WM-^XM-^YM-^ZM-^[M-^\M-^]M-^^M-^_M- M-!M-"M-#M-$M-%M-&M-'M-(M-)M-*M-+M-,M--M-.M-/M-0M-1M-2M-3M-4M-5M-6M-7M-8M-9M-:M-;M-<M-=M->M-?M-@M-AM-BM-CM-DM-EM-FM-GM-HM-IM-JM-KM-LM-MM-NM-OM-PM-QM-RM-SM-TM-UM-VM-WM-XM-YM-ZM-[M-\M-]M-^M-_M-`M-aM-bM-cM-dM-eM-fM-gM-hM-iM-jM-kM-lM-mM-nM-oM-pM-qM-rM-sM-tM-uM-vM-wM-xM-yM-zM-{M-|M-}M-~M-^?
Filtered:
perl -e 'for($i=0; $i<=255; $i++){print chr($i);}' | perl -lpe 's/[^[:print:]]//g' | cat -A
$
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~$
Explanation:
I am printing all 256 byte values (the ASCII table plus the high-bit range 128-255) with:
perl -e 'for($i=0; $i<=255; $i++){print chr($i);}'
I am making the non-printable characters visible with:
cat -A
I am filtering out the non-printable characters with:
perl -lpe 's/[^[:print:]]//g'
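If you would rather not depend on perl, roughly the same filter can be sketched with tr. This is only a sketch under a few assumptions: the function name strip_nonprintable is made up for illustration, and forcing the C locale means only the bytes 0x20-0x7E plus newline survive, so tabs and multi-byte UTF-8 characters are dropped as well.

```shell
#!/bin/bash
# Hypothetical helper: keep printable ASCII and newlines, delete every other byte.
strip_nonprintable() {
    # -c complements the set, -d deletes: everything that is NOT
    # printable and NOT a newline is removed.
    LC_ALL=C tr -cd '[:print:]\n'
}

# \001 (Ctrl-A) and \033 (ESC) are removed, the text is kept.
printf 'who\001am\033i\n' | strip_nonprintable   # prints: whoami
```

Because tr works on the whole stream, it can sit directly in the pipeline instead of being called once per line.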
Edit: It seems to me that you also need to remove the ANSI color escape sequences:
Example:
perl -MTerm::ANSIColor -e 'print colored("yellow on_magenta","yellow on_magenta"),"\n"'| sed -r "s/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[mK]//g" | perl -lpe 's/[^[:print:]]//g'
Adapting to your code:
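format_line() {
    while IFS= read -r line; do
        # Strip ANSI color (m) and erase-line (K) escape sequences
        line=$(sed -r "s/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[mK]//g" <<< "$line")
        # Drop any remaining non-printable characters
        line=$(perl -lpe 's/[^[:print:]]//g' <<< "$line")
        # printf, unlike echo -e, does not interpret backslashes inside $line
        printf '%s\t%s\t%s\n' "$(date +"%Y-%m-%d %H:%M:%S %z")" "${desc}" "${line}"
    done
}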
I also edited your grep command:
ssh "$@" | tee >(grep -Po '(?<=\$).*' --color=never --line-buffered | format_line >> ${logfile})
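To see what that pattern keeps, here is a small sketch you can run without an ssh session. It assumes GNU grep built with PCRE support (needed for -P), and the sample prompt line is made up for illustration.

```shell
#!/bin/bash
# Two sample lines: a prompt followed by a command, and a line of command output.
# (?<=\$) is a lookbehind: the match starts right after a literal "$",
# and -o prints only the matched part, so the prompt itself is dropped.
printf '%s\n' 'tiago@localhost:~$ whoami' 'tiago' | grep -Po '(?<=\$).*'
# prints " whoami" (with the leading space that follows the prompt's $)
```

The output line "tiago" contains no "$", so it is filtered out. Note that any output line that happens to contain a "$" would be logged as well.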
Below is the output of my test:
2014-06-26 10:11:10 +0100 sshlog tiago@localhost [START]
2014-06-26 10:11:15 +0100 sshlog tiago@localhost whoami
2014-06-26 10:11:16 +0100 sshlog tiago@localhost exit
2014-06-26 10:11:16 +0100 sshlog tiago@localhost [END]
While writing your own script is a great learning experience, you can also use the script command to record everything printed on your terminal to a file.
The resulting file will still contain the control characters, but there are multiple ways to get rid of them, as described in How to clean up output of linux 'script' command.
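As a rough sketch of such a cleanup, the filter below removes ANSI escape sequences and then applies backspaces, so text that was typed and erased does not end up in the log. The name clean_typescript and the file names are hypothetical, and this is not a full terminal emulator: redraws caused by cursor movement (for example long history recalls with the up/down keys) can still leave leftovers.

```shell
#!/bin/bash
# Hypothetical helper: reads raw terminal output on stdin, writes cleaned text.
clean_typescript() {
    local esc bs
    esc=$(printf '\033')   # ESC character
    bs=$(printf '\b')      # backspace character
    # 1. remove CSI sequences such as colors (ESC[1;32m) and erase-line (ESC[K)
    # 2. repeatedly delete "some character + backspace" pairs
    # 3. drop any backspace that had nothing left to erase
    sed -e "s/${esc}\[[0-9;?]*[A-Za-z]//g" \
        -e ':a' -e "s/[^${bs}]${bs}//" -e 'ta' \
        -e "s/${bs}//g"
}

# The user typed "whoamx", pressed backspace, then typed "i".
printf 'whoamx\b\033[Ki\n' | clean_typescript   # prints: whoami

# Assumed usage with util-linux script (user@host is a placeholder):
#   script -q -c "ssh user@host" session.raw
#   clean_typescript < session.raw > session.log
```

Removing the escape sequences first matters, because a shell typically echoes a backspace followed by an erase-line sequence, and the backspace has to end up next to the character it erases.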