I'm trying to build a Python daemon that launches other fully independent processes.
The general idea is for a given shell command, poll every few seconds and ensure that exactly k instances of the command are running. We keep a directory of pidfiles, and when we poll we remove pidfiles whose pids are no longer running and start up (and make pidfiles for) however many processes we need to get to k of them.
The child processes also need to be fully independent, so that if the parent process dies the children won't be killed. From what I've read, it seems there is no way to do this with the subprocess
module. To this end, I used the snippet mentioned here:
http://code.activestate.com/recipes/66012-fork-a-daemon-process-on-unix/
I made a couple necessary modifications (you'll see the lines commented out in the attached snippet):
- The original parent process can't exit because we need the launcher daemon to persist indefinitely.
- The child processes need to start with the same cwd as the parent.
Here's my spawn fn and a test:
import os
import sys
import subprocess
import time
def spawn(cmd, child_cwd):
"""
do the UNIX double-fork magic, see Stevens' "Advanced
Programming in the UNIX Environment" for details (ISBN 0201563177)
http://www.erlenstar.demon.co.uk/unix/faq_2.html#SEC16
"""
try:
pid = os.fork()
if pid > 0:
# exit first parent
#sys.exit(0) # parent daemon needs to stay alive to launch more in the future
return
except OSError, e:
sys.stderr.write("fork #1 failed: %d (%s)\n" % (e.errno, e.strerror))
sys.exit(1)
# decouple from parent environment
#os.chdir("/") # we want the children processes to
os.setsid()
os.umask(0)
# do second fork
try:
pid = os.fork()
if pid > 0:
# exit from second parent
sys.exit(0)
except OSError, e:
sys.stderr.write("fork #2 failed: %d (%s)\n" % (e.errno, e.strerror))
sys.exit(1)
# redirect standard file descriptors
sys.stdout.flush()
sys.stderr.flush()
si = file('/dev/null', 'r')
so = file('/dev/null', 'a+')
se = file('/dev/null', 'a+', 0)
os.dup2(si.fileno(), sys.stdin.fileno())
os.dup2(so.fileno(), sys.stdout.fileno())
os.dup2(se.fileno(), sys.stderr.fileno())
pid = subprocess.Popen(cmd, cwd=child_cwd, shell=True).pid
# write pidfile
with open('pids/%s.pid' % pid, 'w') as f: f.write(str(pid))
sys.exit(1)
def mkdir_if_none(path):
if not os.access(path, os.R_OK):
os.mkdir(path)
if __name__ == '__main__':
try:
cmd = sys.argv[1]
num = int(sys.argv[2])
except:
print 'Usage: %s <cmd> <num procs>' % __file__
sys.exit(1)
mkdir_if_none('pids')
mkdir_if_none('test_cwd')
for i in xrange(num):
print 'spawning %d...'%i
spawn(cmd, 'test_cwd')
time.sleep(0.01) # give the system some breathing room
In this situation, things seem to work fine, and the child processes persist even when the parent is killed. However, I'm still running into a spawn limit on the original parent. After ~650 spawns (not concurrently, the children have finished) the parent process chokes with the error:
spawning 650...
fork #2 failed: 35 (Resource temporarily unavailable)
Is there any way to rewrite my spawn function so that I can spawn these independent child processes indefinitely? Thanks!