C++ program crashes with EXIT CODE: 9 (SIGKILL)

Posted 2020-08-09 09:11

Question:

My application program crashes with EXIT CODE: 9 (SIGKILL).

I never ran any command such as 'kill -9 (pid)' or 'pkill (process name)' that could kill the running process.

Where should I start debugging in this case?

  1. I tried to dump the stack trace when the program crashes, but I found that SIGKILL cannot be caught for error handling.

  2. The program uses MPI and runs in a cluster environment. It dies after about an hour of running.

Are there any common causes of a SIGKILL?

(It's running on Linux, CentOS 7.)

Answer 1:

I am answering my own question so that it can help someone later.

The crash was caused by the process running out of memory.

The process allocates too much memory, putting pressure on the OS. The OS has a hit man, the oom-killer, that kills such processes for the sake of system stability. The oom-killer uses bullets called SIGKILL.
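To see whether memory really is growing before the hit man arrives, the process can report its own peak memory use from time to time. A minimal sketch, assuming Linux (where `getrusage` reports `ru_maxrss` in kilobytes); the helper name `peak_rss_kb` is made up for illustration:

```cpp
#include <sys/resource.h>  // getrusage, struct rusage, RUSAGE_SELF

// Hypothetical helper: returns the peak resident set size of the calling
// process in kilobytes (the unit Linux uses for ru_maxrss), or -1 on failure.
long peak_rss_kb() {
    struct rusage usage {};
    if (getrusage(RUSAGE_SELF, &usage) != 0) {
        return -1;
    }
    return usage.ru_maxrss;
}
```

Printing this value (together with the MPI rank) every few minutes should show which ranks keep growing in the hour before the kill.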

However, since SIGKILL is invisible (it cannot be caught or handled by the application), it is not always easy for newbies like me to figure out the true reason for the crash.
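The "cannot be caught" part can be checked directly: on Linux, asking `sigaction` to install a handler for SIGKILL fails with `EINVAL`. A small sketch, where `try_install_handler` is a made-up helper name:

```cpp
#include <cerrno>
#include <signal.h>  // sigaction, sigemptyset, SIGKILL

// Hypothetical helper: tries to install a do-nothing handler for signum.
// Returns 0 on success, or the errno value reported by sigaction on failure.
int try_install_handler(int signum) {
    struct sigaction sa {};
    sa.sa_handler = [](int) {};  // captureless lambda converts to void (*)(int)
    sigemptyset(&sa.sa_mask);
    if (sigaction(signum, &sa, nullptr) != 0) {
        return errno;
    }
    return 0;
}
```

Calling `try_install_handler(SIGKILL)` returns `EINVAL`, while an ordinary signal such as `SIGUSR1` succeeds. This is why a stack-trace handler never gets a chance to run when the oom-killer strikes.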

The good news is that when the hit man kills your process, it normally logs its action in /var/log/messages.

Depending on your OS configuration, the oom-killer might not log any message at all. In that case, you can configure logging yourself; search for "rsyslog configuration".

See also: Finding which process was killed by Linux OOM killer