I have some tasks [for my RPi] in Python that involve a lot of sleep
ing: do something that takes a second or two or three, then go wait for several minutes or hours.
I want to pass control back to the OS (Linux) in that sleep time. For this, I should daemonise those tasks. One way is by using Python's Standard daemon process library.
But daemons aren't so easy to understand. As per the Rationale paragraph of PEP 3143, a well behaved daemon should do the following.
- Close all open file descriptors.
- Change current working directory.
- Reset the file access creation mask.
- Run in the background.
- Disassociate from process group.
- Ignore terminal I/O signals.
- Disassociate from control terminal.
- Don't reacquire a control terminal.
- Correctly handle the following circumstances:
- Started by System V init process.
- Daemon termination by SIGTERM signal.
- Children generate SIGCLD signal.
For a Linux/Unix novice like me, some of this is hardly an explanation. But I want to know why I do what I do. So what is the rationale behind this rationale?
PEP 3142 took these requirements from Unix Network Programming ('UNP') by the late W. Richard Stevens. The explanation below is quoted or summarised from that book. It's not so easily found online, and it may be illegal to download. So I borrowed it from the library. Pages referred to are in the second edition, Volume 1 (1998). (The PEP refers to the first edition, 1990.)
Close all open file descriptors.
(This 'Howdy World' Python daemon demonstrates this.)
Change current working directory
Why would you want to unmount a filesystem? Two reasons:
1. You want to separate (and be able mount and unmount) directories that can fill up with user data from directories dedicated to the OS.
2. If you start a daemon from, say, a USB-stick, you want to be able to unmount that stick without interfering with the daemon.
Reset the file access creation mask.
Run in the background.
By definition,
Disassociate from process group.
In order to understand this, you need to understand what a process group is, and that means you need to know what
fork
does.What fork does
fork
is the only way (in Unix) to create a new process. (in Linux, there is alsoclone
). Key in understandingfork
is that it returns twice when called (once): once in the calling process (= parent) with the process ID of the newly created process (= child), and once in the child. "All descriptors known by the parent when forking, are shared with the child when fork returns." (UNP p 102). When a process wants to execute another program, it creates a new process by calling fork, which creates a copy of itself. Then one of them (usually the child) calls the new program. (UNP, p 102)Why disassociate from process group
The point is that a session leader may acquire a controlling terminal. A daemon should never do this, it must stay in the background. This is achieved by calling
fork
twice: the parent forks to create a child, the child forks to create a grandchild. Parent and child are terminated, but grandchild remains. But because it's a grandchild, it's not a session leader, and therefor can't acquire a controlling terminal. (Summarised from UNP par 12.4 p 335)The double fork is discussed in more detail here, and in the comments below.
Ignore terminal I/O signals.
Disassociate from control terminal and don't reacquire a control terminal.
By now, the reasons are obvious:
Correctly handle the following circumstances:
Started by System V init process
Daemon termination by SIGTERM signal
Children generate SIGCLD signal
On a final note, when I started to find answers to my question in UNP, it soon struck me I really should read more of it. It's 900+ (!) pages, from 1998 (!) but I believe the concepts and the explanations in UNP stand the test of time, gloriously. Stevens not only knew very well what he was talking about, he also understood what was difficult about it, and made it easier to understand. That's really rare.