I've been pulling my hair out on this one for several hours now. I welcome any new ideas on where to look next.
The objective is to log in to a custom application CLI over SSH and then drop into a debug shell on the far-end device using one of the custom CLI commands. On the client side I'm using CentOS minimal and running ssh as follows:
Working case:
[user@ashleys-xpvm ws]$ ssh -p8222 admin@192.168.56.20
admin@192.168.56.20's password:
Welcome to CLI
admin connected from 172.29.33.108 using ssh on scm2
TRAN39# debug-utils shell
device@scm2:~$
The ssh client session accesses the custom CLI using the application-specific port 8222. Once inside the CLI, we drop down to the bash shell using the 'debug-utils shell' command.
This sequence was scripted with Python/pexpect and that worked fine when the script was launched from the user's command line. The problem arose when the script was moved to the crontab to be run automatically by crond. In the latter case, the script fails in a peculiar way.
Following the recommendation from the post "How to simulate the environment cron executes a script with?", I launched a new shell on the client machine with the same environment variables that the cron job uses, and I was able to manually reproduce the problem that the automatic cron job was running into.
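The same reproduction can be approximated from Python by starting a child process with only the variables cron provides. This is a minimal sketch, not the script from the question: the values in CRON_LIKE_ENV mirror the cron environment listed further down in this post, and the helper names run_with_cron_env and child_sees_term are hypothetical, chosen only for this illustration.

```python
import subprocess
import sys

# Assumption: these values mirror the cron `env` listing in this post.
# Check your own by dumping `env` from a real cron job.
CRON_LIKE_ENV = {
    "SHELL": "/bin/sh",
    "USER": "user",
    "PATH": "/usr/bin:/bin",
    "LANG": "en_US.UTF-8",
    "HOME": "/home/user",
    "LOGNAME": "user",
}


def run_with_cron_env(argv, env=CRON_LIKE_ENV):
    """Run argv with ONLY the given variables set (nothing inherited)."""
    return subprocess.run(
        argv,
        env=dict(env),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )


def child_sees_term(env=CRON_LIKE_ENV):
    """Report what TERM looks like from inside the child process."""
    probe = [
        sys.executable,
        "-c",
        "import os; print(os.environ.get('TERM', '<unset>'))",
    ]
    return run_with_cron_env(probe, env).stdout.strip()


if __name__ == "__main__":
    print("TERM inside cron-like child:", child_sees_term())
```

Replacing the probe command with the real script should, under these assumptions, reproduce the cron behaviour without waiting for crond to fire.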
With the cron environment set, the far-end device now throws the following error at the point where we issue the command to drop into the device's bash shell:
sh-4.2$ ssh -p8222 admin@192.168.56.20
admin@192.168.56.20's password:
Welcome to CLI
admin connected from 172.29.33.108 using ssh on scm2
TRAN39# debug-utils shell
error: failed to decode arguments
TRAN39#
Once I had the problem reproduced, I set up two terminals, one with the working environment variables and the other with the failing environment variables. I ran ssh from both terminals with the '-vvv' flag and compared the debug output between the two.
The two outputs were identical except where ssh steps through the environment variables to determine what to send to the SSH server (obviously), and the 'bits set' lines were slightly different. Looking at the environment variable lines, I could see that ssh ignores all of them except LANG, which is identical in both the working case and the failing case.
I'm at a loss now for why the ssh server at the far-end device is behaving differently between these two client-side environment settings.
Here is the working environment:
[user@centos_vm ws]$ env
XDG_SESSION_ID=294
HOSTNAME=centos_vm
SELINUX_ROLE_REQUESTED=
TERM=xterm-256color
SHELL=/bin/bash
HISTSIZE=1000
SSH_CLIENT=192.168.56.20 52795 22
SELINUX_USE_CURRENT_RANGE=
OLDPWD=/home/user
SSH_TTY=/dev/pts/4
USER=user
LS_COLORS=rs=0:di=38;5;27:ln=38;5;51:mh=44;38;5;15:pi=40;38;5;11:so=38;5;13:do=38;5;5:bd=48;5;232;38;5;11:cd=48;5;232;38;5;3:or=48;5;232;38;5;9:mi=05;48;5;232;38;5;15:su=48;5;196;38;5;15:sg=48;5;11;38;5;16:ca=48;5;196;38;5;226:tw=48;5;10;38;5;16:ow=48;5;10;38;5;21:st=48;5;21;38;5;15:ex=38;5;34:*.tar=38;5;9:*.tgz=38;5;9:*.arc=38;5;9:*.arj=38;5;9:*.taz=38;5;9:*.lha=38;5;9:*.lz4=38;5;9:*.lzh=38;5;9:*.lzma=38;5;9:*.tlz=38;5;9:*.txz=38;5;9:*.tzo=38;5;9:*.t7z=38;5;9:*.zip=38;5;9:*.z=38;5;9:*.Z=38;5;9:*.dz=38;5;9:*.gz=38;5;9:*.lrz=38;5;9:*.lz=38;5;9:*.lzo=38;5;9:*.xz=38;5;9:*.bz2=38;5;9:*.bz=38;5;9:*.tbz=38;5;9:*.tbz2=38;5;9:*.tz=38;5;9:*.deb=38;5;9:*.rpm=38;5;9:*.jar=38;5;9:*.war=38;5;9:*.ear=38;5;9:*.sar=38;5;9:*.rar=38;5;9:*.alz=38;5;9:*.ace=38;5;9:*.zoo=38;5;9:*.cpio=38;5;9:*.7z=38;5;9:*.rz=38;5;9:*.cab=38;5;9:*.jpg=38;5;13:*.jpeg=38;5;13:*.gif=38;5;13:*.bmp=38;5;13:*.pbm=38;5;13:*.pgm=38;5;13:*.ppm=38;5;13:*.tga=38;5;13:*.xbm=38;5;13:*.xpm=38;5;13:*.tif=38;5;13:*.tiff=38;5;13:*.png=38;5;13:*.svg=38;5;13:*.svgz=38;5;13:*.mng=38;5;13:*.pcx=38;5;13:*.mov=38;5;13:*.mpg=38;5;13:*.mpeg=38;5;13:*.m2v=38;5;13:*.mkv=38;5;13:*.webm=38;5;13:*.ogm=38;5;13:*.mp4=38;5;13:*.m4v=38;5;13:*.mp4v=38;5;13:*.vob=38;5;13:*.qt=38;5;13:*.nuv=38;5;13:*.wmv=38;5;13:*.asf=38;5;13:*.rm=38;5;13:*.rmvb=38;5;13:*.flc=38;5;13:*.avi=38;5;13:*.fli=38;5;13:*.flv=38;5;13:*.gl=38;5;13:*.dl=38;5;13:*.xcf=38;5;13:*.xwd=38;5;13:*.yuv=38;5;13:*.cgm=38;5;13:*.emf=38;5;13:*.axv=38;5;13:*.anx=38;5;13:*.ogv=38;5;13:*.ogx=38;5;13:*.aac=38;5;45:*.au=38;5;45:*.flac=38;5;45:*.mid=38;5;45:*.midi=38;5;45:*.mka=38;5;45:*.mp3=38;5;45:*.mpc=38;5;45:*.ogg=38;5;45:*.ra=38;5;45:*.wav=38;5;45:*.axa=38;5;45:*.oga=38;5;45:*.spx=38;5;45:*.xspf=38;5;45:
MAIL=/var/spool/mail/user
PATH=/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/user/.local/bin:/home/user/bin
PWD=/home/user/ws
LANG=en_US.UTF-8
SELINUX_LEVEL_REQUESTED=
HISTCONTROL=ignoredups
SHLVL=1
HOME=/home/user
LOGNAME=user
SSH_CONNECTION=192.168.56.20 52795 192.168.56.101 22
LESSOPEN=||/usr/bin/lesspipe.sh %s
XDG_RUNTIME_DIR=/run/user/1000
_=/usr/bin/env
[user@centos_vm ws]$
...and here is the failing (i.e. cron) environment:
sh-4.2$ env
XDG_SESSION_ID=321
SHELL=/bin/sh
USER=user
PATH=/usr/bin:/bin
PWD=/home/user/ws
LANG=en_US.UTF-8
HOME=/home/user
SHLVL=2
LOGNAME=user
XDG_RUNTIME_DIR=/run/user/1000
_=/usr/bin/env
OLDPWD=/home/user
sh-4.2$
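Comparing two `env` dumps by eye is error-prone, so a small diff helper can make the differences explicit. The following is a sketch under simple assumptions: each variable sits on one line as NAME=value (multi-line values would need more care), the sample strings are abbreviated from the two listings above, and the function names parse_env and diff_env are hypothetical.

```python
def parse_env(text):
    """Parse `env` output (one NAME=value per line) into a dict."""
    result = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name:  # skip blank lines and lines without '='
            result[name] = value
    return result


def diff_env(working, failing):
    """Return (only_in_working, only_in_failing, changed) as sorted lists."""
    only_working = sorted(set(working) - set(failing))
    only_failing = sorted(set(failing) - set(working))
    changed = sorted(
        k for k in set(working) & set(failing) if working[k] != failing[k]
    )
    return only_working, only_failing, changed


# Abbreviated samples taken from the two listings in this post.
WORKING = """
TERM=xterm-256color
SHELL=/bin/bash
SSH_TTY=/dev/pts/4
PATH=/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin
LANG=en_US.UTF-8
"""

FAILING = """
SHELL=/bin/sh
PATH=/usr/bin:/bin
LANG=en_US.UTF-8
"""

if __name__ == "__main__":
    print(diff_env(parse_env(WORKING), parse_env(FAILING)))
    # prints (['SSH_TTY', 'TERM'], [], ['PATH', 'SHELL'])
```

With these samples, TERM shows up as present only in the working environment, while LANG does not appear in any of the three lists.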
I'm out of my depth on ssh debugging at this point, so any guidance on where to look next is greatly appreciated.
Usually, ssh without a command (ssh user@host) passes the value of TERM on the local host to the remote server.
In crontab, crond by default does not set the TERM variable, so after ssh login TERM will be set to dumb (which is not fully functional).
In your case it sounds like the remote application requires a more functional TERM, so explicitly setting TERM=xterm in crontab (which will be passed to the remote server) would fix it.
Note that ssh with a command (ssh user@host command...) will not allocate a pty on the remote server, so the local TERM will not be passed. To force creating a pty and passing the variable, we must use ssh -t.
For background, see "Dumb terminals" on Wikipedia.