How to find the reason for a dead process without

Posted 2019-03-16 10:32

This is an interview question.

A developer started a process, but when a customer tried to use it, the customer found it wasn't running. The developer logged in and found that the process had died. How can the developer find out what went wrong?

Follow-up: a running process is supposed to write logs to a file, but the file contains no logs. How can the developer figure out what is going on in the process?

My thinking: if the program can be re-run, I would use gdb to trace the process. If not, check the output files left by the process (the application program), or add print statements to the code.

But are there other ways to do it, using information generated by the OS?

4 answers
#2 · 2019-03-16 11:12

If you have the disk space and spare CPU power, you can leave strace following the program to catch the sequence leading up to exit.
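A minimal sketch of what leaving strace attached could look like, assuming strace is installed and you are allowed to trace the target process. The helper name trace_pid and the default log file name are made up for illustration.

```shell
# Hypothetical helper: attach strace to an already-running PID and
# write its system calls to a log file until the process exits.
trace_pid() {
    pid=$1
    out=${2:-strace.$1.log}   # default log name is an assumption
    # -f  follow child processes created by the target
    # -tt prefix each line with a timestamp (microsecond resolution)
    # -o  write the trace to a file instead of stderr
    # -p  attach to an existing process by PID
    strace -f -tt -o "$out" -p "$pid"
}

# Usage:
#   trace_pid 12345 /var/tmp/myprogram.trace
# To trace from the very start instead of attaching:
#   strace -f -tt -o /var/tmp/myprogram.trace ./myprogram
```

Once the process dies, the last lines of the trace usually show how it ended, for example `+++ exited with 1 +++` or `+++ killed by SIGSEGV +++`. Attaching with -p can also help with the follow-up question, since it shows whether the process is issuing any write calls at all.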

One possible cause if the program died without leaving any trace is the Out-Of-Memory (OOM) killer. This will leave a message in the kernel log if it kills your process.
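As a rough sketch, the kernel log can be searched for OOM killer messages like this. The function name find_oom_kills is made up, and the exact message wording varies between kernel versions, so the pattern is deliberately loose.

```shell
# Hypothetical helper: filter kernel log text on stdin for lines
# that the OOM killer typically writes.
find_oom_kills() {
    grep -i -E 'out of memory|oom-killer|killed process'
}

# Usage (reading the kernel log may require root):
#   dmesg | find_oom_kills
#   journalctl -k | find_oom_kills    # on systemd-based systems
```

If a matching line names your program and PID, the OOM killer is the likely cause. No match does not rule it out, because the kernel ring buffer may already have rotated.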

Process accounting can also be modified to provide some clues, by reporting the exit code along with the exit time.

干净又极端
#3 · 2019-03-16 11:26

... use a debugger like gdb ...

forever°为你锁心
#4 · 2019-03-16 11:30

Sometimes programs don't create core dumps. In this case knowing the exit code of your software may help.

You can use the script below to start your software and log its exit status, which helps narrow down why it exited.

Example:

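#!/bin/bash
# Run the program under investigation
./myprogram

# Capture its exit code immediately, before another command overwrites $?
exitvalue=$?

# Log the exit code to syslog (often /var/log/messages); -s also echoes it to stderr
logger -s "exit code of my program is $exitvalue"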
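To get a reason out of the logged number, it helps to know the shell convention that a status above 128 usually means the process was killed by signal (status - 128). A small sketch follows; describe_exit is a made-up helper name, and a program can also return a status above 128 on its own, so treat the result as a hint.

```shell
# Hypothetical helper: turn a shell exit status into a readable reason.
describe_exit() {
    code=$1
    if [ "$code" -gt 128 ]; then
        sig=$((code - 128))
        # kill -l N prints the name of signal number N (e.g. 9 -> KILL)
        name=$(kill -l "$sig" 2>/dev/null) || name="unknown"
        echo "terminated by signal $sig ($name)"
    elif [ "$code" -eq 0 ]; then
        echo "exited normally"
    else
        echo "exited with error status $code"
    fi
}

# Usage:
#   ./myprogram
#   describe_exit $?
```

For example, `describe_exit 137` prints `terminated by signal 9 (KILL)`, which is what you would typically see after the OOM killer or a `kill -9`.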
做个烂人
#5 · 2019-03-16 11:31

Are there other ways to do it, using information generated by the OS?

A core dump is one option.
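A rough sketch of enabling core dumps for a single run, assuming a Linux system. The wrapper name run_with_core is made up, and where the core file ends up depends on the system's core_pattern setting, so the gdb line in the usage notes is only an assumption about a default setup.

```shell
# Hypothetical wrapper: run a command with the core file size limit
# lifted, without changing the limit of the calling shell.
run_with_core() {
    (
        # Raising the soft limit fails if the hard limit is lower
        ulimit -c unlimited 2>/dev/null ||
            echo "warning: could not raise the core size limit" >&2
        # Replace the subshell with the program so its exit status is returned
        exec "$@"
    )
}

# Usage:
#   run_with_core ./myprogram
#   cat /proc/sys/kernel/core_pattern   # Linux: where/how cores are written
#   gdb ./myprogram core                # then type bt for a backtrace
```

If core_pattern starts with a pipe character, cores are handed to a helper program (a crash reporter, for instance) rather than written to the current directory, so check there first.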
