I have a watchdog application. It watches my main app which might crash for one reason or another (I know it is bad, but this is not the point).
I programmed this watchdog to accept SIGUSR1 as a request to stop monitoring my application's presence. I signal it with
kill -SIGUSR1 `pidof myapp`
This works really well. My problem comes when I try to signal an older version of the watchdog which does not have this functionality built in. In this case, the kill signal kills the watchdog (terminates the process), which leads to further complications (rebooting of the device).
Is there a way to signal my watchdog with SIGUSR1 so that it does not terminate if this particular signal is unhandled?
From the GNU docs about signal handling:
The SIGUSR1 and SIGUSR2 signals are set aside for you to use any way you want. They're useful for simple interprocess communication, if you write a signal handler for them in the program that receives the signal.
There is an example showing the use of SIGUSR1 and SIGUSR2 in section Signaling Another Process.
The default action is to terminate the process.
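For reference, the receiving side could look roughly like this. It is a minimal shell sketch of installing a handler, not the asker's actual watchdog code, and the monitoring variable is a made-up name:

```shell
# Hypothetical flag: 1 while the watchdog should keep monitoring.
monitoring=1

# Installing a handler replaces the default action (terminate) for SIGUSR1.
# A process without such a handler is killed by the same signal.
trap 'monitoring=0' USR1
```

A watchdog built without the trap line still has the default disposition, which is why the older version dies.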
The default action for SIGINFO is to do nothing, so it may be more suitable:
SIGINFO: Information request. In 4.4 BSD and the GNU system, this signal is sent to all the processes in the foreground process group of the controlling terminal when the user types the STATUS character in canonical mode; see section Characters that Cause Signals.
If the process is the leader of the process group, the default action is to print some status information about the system and what the process is doing. Otherwise the default is to do nothing.
SIGHUP is emitted when the controlling terminal is closed, but since most daemons are not attached to a terminal it is not uncommon to use it as "reload":
Daemon programs sometimes use SIGHUP as a signal to restart themselves, the most common reason for this being to re-read a configuration file that has been changed.
BTW, your watchdog could read a config file from time to time in order to know if it should relaunch the process.
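A rough sketch of that idea, using the presence of a flag file instead of a full config file. The path /tmp/watchdog.paused, the function name and the relaunch_myapp placeholder are all assumptions for illustration:

```shell
# Hypothetical flag file: while it exists, the watchdog does not relaunch.
PAUSE_FLAG=${PAUSE_FLAG:-/tmp/watchdog.paused}

# Succeeds (exit 0) when monitoring is enabled, i.e. the flag file is absent.
monitoring_enabled() {
    [ ! -e "$PAUSE_FLAG" ]
}

# Sketch of the polling loop inside the watchdog (not run here):
#   while sleep 5; do
#       monitoring_enabled || continue
#       pidof myapp >/dev/null || relaunch_myapp   # relaunch_myapp is a placeholder
#   done
```

Creating the file with touch pauses monitoring and removing it resumes. An older watchdog that knows nothing about the file simply ignores it, so nothing gets killed.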
My personal favorite for a watchdog is supervisor.
$ supervisorctl start someapp
someapp: started
$ supervisorctl status someapp
someapp RUNNING pid 16583, uptime 19:16:26
$ supervisorctl stop someapp
someapp: stopped
Check whether kill -l returns the list of signals on your platform and try some of them, but SIGUSR1 seems like a bad choice.
$ kill -l
1) SIGHUP 2) SIGINT 3) SIGQUIT 4) SIGILL 5) SIGTRAP
6) SIGABRT 7) SIGBUS 8) SIGFPE 9) SIGKILL 10) SIGUSR1
11) SIGSEGV 12) SIGUSR2 13) SIGPIPE 14) SIGALRM 15) SIGTERM
16) SIGSTKFLT 17) SIGCHLD 18) SIGCONT 19) SIGSTOP 20) SIGTSTP
21) SIGTTIN 22) SIGTTOU 23) SIGURG 24) SIGXCPU 25) SIGXFSZ
26) SIGVTALRM 27) SIGPROF 28) SIGWINCH 29) SIGIO 30) SIGPWR
31) SIGSYS 34) SIGRTMIN 35) SIGRTMIN+1 36) SIGRTMIN+2 37) SIGRTMIN+3
38) SIGRTMIN+4 39) SIGRTMIN+5 40) SIGRTMIN+6 41) SIGRTMIN+7 42) SIGRTMIN+8
43) SIGRTMIN+9 44) SIGRTMIN+10 45) SIGRTMIN+11 46) SIGRTMIN+12 47) SIGRTMIN+13
48) SIGRTMIN+14 49) SIGRTMIN+15 50) SIGRTMAX-14 51) SIGRTMAX-13 52) SIGRTMAX-12
53) SIGRTMAX-11 54) SIGRTMAX-10 55) SIGRTMAX-9 56) SIGRTMAX-8 57) SIGRTMAX-7
58) SIGRTMAX-6 59) SIGRTMAX-5 60) SIGRTMAX-4 61) SIGRTMAX-3 62) SIGRTMAX-2
63) SIGRTMAX-1 64) SIGRTMAX
[UPDATE]
Carpetsmoker comments about differences in behavior between Linux and BSDs:
SIGINFO seems to work different on GNU libc & BSD; on BSD, it works as you describe, but on Linux, it either doesn't exist, or is the same as SIGPWR... The GNU libc manual seems incorrect in this regard (your kill -l output also doesn't show SIGINFO)... I don't know why GNU doesn't support it, because I find it to be very useful... – Carpetsmoker
If no handler is installed, the default action for SIGUSR1 is to terminate the process, so you can't safely send that signal to the older watchdog.
Short of updating the watchdog, there is nothing you can do (and I'm assuming that you are unable to differentiate watchdog versions from within the program prior to sending the signal).
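If the device runs Linux, that assumption may not hold: the kernel exposes which signals a process catches, so the sender might be able to check before signalling. The following is only a sketch under that assumption. It needs /proc to be mounted, catches_signal is a hypothetical helper name, and the signal number is passed in by hand (10 is SIGUSR1 in the kill -l listing above; verify it on your device):

```shell
# Hypothetical helper: succeeds (exit 0) if process $1 catches signal number $2.
# Linux only: reads the SigCgt hex bitmask from /proc/<pid>/status.
catches_signal() {
    mask=$(awk '/^SigCgt:/ { print $2 }' "/proc/$1/status" 2>/dev/null)
    [ -n "$mask" ] || return 2          # no such process, or no /proc
    # Keep the low 8 hex digits (signals 1..32) to stay within shell arithmetic.
    low=$(printf '%s' "$mask" | tail -c 8)
    # Bit (signo - 1) is set when a handler is installed for that signal.
    [ $(( (0x$low >> ($2 - 1)) & 1 )) -eq 1 ]
}
```

Possible usage, with the same pidof call as in the question:

```shell
pid=$(pidof myapp)
catches_signal "$pid" 10 && kill -USR1 "$pid"
```

There is a small window between the check and the kill, so this reduces the risk rather than removing it.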