Python multiprocessing pool inside daemon process

Posted 2019-01-24 21:28

Question:

I asked an earlier question about this problem (Zombie process in python multiprocessing daemon) but did not get an answer thorough enough to solve it, most likely because I did not explain the issue rigorously. This question is an attempt to correct that.

I am trying to implement a Python daemon that uses a pool of workers to execute commands using Popen. I have borrowed the basic daemon from http://www.jejik.com/articles/2007/02/a_simple_unix_linux_daemon_in_python/

I have only changed the __init__, daemonize (or, equivalently, start) and stop methods. Here are the changes to the __init__ method:

def __init__(self, pidfile):
#, stdin='/dev/null', stdout='STDOUT', stderr='STDOUT'):
    #self.stdin = stdin
    #self.stdout = stdout
    #self.stderr = stderr
    self.pidfile = pidfile
    self.pool = Pool(processes=4)

I am not setting stdin, stdout and stderr so that I can debug the code with print statements. Also, I have tried moving this pool around to a few places but this is the only place that does not produce exceptions.

Here are the changes to the daemonize method:

def daemonize(self):
    ...

    # redirect standard file descriptors
    #sys.stdout.flush()
    #sys.stderr.flush()
    #si = open(self.stdin, 'r')
    #so = open(self.stdout, 'a+')
    #se = open(self.stderr, 'a+', 0)
    #os.dup2(si.fileno(), sys.stdin.fileno())
    #os.dup2(so.fileno(), sys.stdout.fileno())
    #os.dup2(se.fileno(), sys.stderr.fileno())

    print self.pool

    ...

As before, I am not redirecting I/O so that I can debug. The print is there so that I can check the pool's location in memory.

And the stop method changes:

def stop(self):
    ...

    # Try killing the daemon process
    try:
        print self.pool
        print "closing pool"
        self.pool.close()
        print "joining pool"
        self.pool.join()
        print "set pool to None"
        self.pool = None
        while 1:
            print "kill process"
            os.kill(pid, SIGTERM)

    ...

The idea here is that I need to clean up the pool as well as kill the process. The self.pool = None was just a guess at a fix, and it did not work. At first I thought this was a problem with zombie children, which appeared when self.pool.close() and self.pool.join() were inside the while loop together with os.kill(pid, SIGTERM). That was before I started checking the pool's location with print self.pool. Having done that, I believe the pool is not the same object when the daemon starts and when it stops. Here is some output:

me@pc:~/pyCode/jobQueue$ sudo ./jobQueue.py start
<multiprocessing.pool.Pool object at 0x1c543d0>
me@pc:~/pyCode/jobQueue$ sudo ./jobQueue.py stop
<multiprocessing.pool.Pool object at 0x1fb7450>
closing pool
joining pool
set pool to None
kill process
kill process
... [ stuck in infinite loop]

The different addresses of the two objects suggest to me that they are not the same pool. Is one of them perhaps the zombie?
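Object addresses are only meaningful within a single process, and start and stop are separate invocations of the script. Assuming the daemon object is constructed on every invocation (as in the borrowed example), each run builds its own Pool in __init__, so differing addresses are expected. A rough way to see which process actually owns a pool's workers is to ask the workers for their parent PID. The sketch below is written for Python 3 on Unix and requests the fork start method explicitly; report_ids and pool_ownership are hypothetical names, not part of the original code.

```python
import multiprocessing
import os


def report_ids(_):
    """Runs inside a pool worker: return (worker PID, worker's parent PID)."""
    return os.getpid(), os.getppid()


def pool_ownership(workers=2):
    """Create a pool and report the creator's PID plus each worker's IDs."""
    # Assumption: Unix, with the "fork" start method requested explicitly.
    ctx = multiprocessing.get_context("fork")
    pool = ctx.Pool(processes=workers)
    try:
        ids = pool.map(report_ids, range(workers * 4))
    finally:
        pool.close()
        pool.join()
    return os.getpid(), ids
```

Every worker should report the creating process as its parent, which is why a pool created by the stop invocation has no connection to the one created by the start invocation.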

After CTRL+C, here is what I get from ps aux|grep jobQueue:

root     21161  0.0  0.0  50384  5220 ?        Ss   22:59   0:00 /usr/bin/python ./jobQueue.py start
root     21162  0.0  0.0      0     0 ?        Z    22:59   0:00 [jobQueue.py] <defunct>
me       21320  0.0  0.0   7624   940 pts/0    S+   23:00   0:00 grep --color=auto jobQueue
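A possible factor in the endless "kill process" output: the while loop in stop has no exit condition of its own, so it can only end when os.kill raises an error, and that happens only once the PID no longer exists. A defunct (zombie) process still exists until its parent reaps it, so signalling it keeps succeeding. If the PID in the pidfile belongs to such a process, the loop never ends. A bounded version of the loop might look like the following sketch (Python 3 on Unix; terminate_pid and its timeout and interval parameters are hypothetical, not part of the borrowed daemon).

```python
import os
import signal
import time


def terminate_pid(pid, timeout=5.0, interval=0.1):
    """Send SIGTERM repeatedly until the PID disappears or time runs out.

    Returns True if the PID no longer exists, False if it is still
    present (running or zombie) when the timeout expires.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            os.kill(pid, signal.SIGTERM)
        except ProcessLookupError:
            # No such process: it has exited and has been reaped.
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A False result at least tells stop that the process is still in the process table, instead of looping forever.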

I have tried moving self.pool = Pool(processes=4) to a number of different places. If it is moved to the start() or daemonize() methods, print self.pool throws an exception saying that it is NoneType. In addition, the location seems to change the number of zombie processes that pop up.

Currently, I have not added the functionality to run anything via the workers. My problem seems completely related to setting up the pool of workers correctly. I would appreciate any information that leads to solving this issue or advice about creating a daemon service that uses a pool of workers to execute a series of commands using Popen. Since I haven't gotten that far, I do not know what challenges I face ahead. I am thinking I might just need to write my own pool but if there is a nice trick to make the pool work here, it would be amazing.
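For the stated goal of having pool workers execute commands through Popen, a minimal sketch could look like the following. It is written for Python 3 on Unix and leaves the daemon out entirely. The names run_command and run_all are hypothetical, as are the choices to merge stderr into stdout and to use the fork start method.

```python
import multiprocessing
import subprocess
import sys


def run_command(argv):
    """Run one command in a worker; return (exit code, combined output)."""
    proc = subprocess.Popen(
        argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
    )
    output, _ = proc.communicate()  # waits for the command to finish
    return proc.returncode, output.decode()


def run_all(commands, workers=4):
    """Run every command through a pool and return results in input order."""
    # Assumption: Unix, with the "fork" start method requested explicitly.
    ctx = multiprocessing.get_context("fork")
    pool = ctx.Pool(processes=workers)
    try:
        return pool.map(run_command, commands)
    finally:
        pool.close()
        pool.join()
```

For example, run_all([[sys.executable, "-c", "print(1)"]]) returns a one-element list holding the exit code and captured output of that command. Calling communicate() in each worker also reaps the command, so the commands themselves do not linger as zombies.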

Answer 1:

The solution is to put self.pool = Pool(processes=4) as the last line of the daemonize method. Otherwise the pool ends up getting lost somewhere (perhaps in the forks). The pool can then be accessed inside the run method, which is overridden by the application you wish to daemonize. However, the pool cannot be accessed in the stop method; attempting to do so leads to NoneType exceptions. I believe there is a more elegant solution, but this works and it is all I have for now. If I want stop to fail while the pool is still in action, I will have to add functionality to run and some form of messaging, but I am not currently concerned with this.
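One way to picture why the placement matters: a pool's worker processes are children of whichever process called Pool(), so a pool built before daemonize forks stays attached to a process that is not the final daemon. The sketch below (Python 3 on Unix, so its syntax differs from the print statements in the post) forks once as a stand-in for the daemonizing forks and builds the pool only inside the child. The names square and compute_in_forked_child, the pipe used to pass results back, and the explicit fork start method are assumptions for illustration, not part of the original daemon.

```python
import json
import multiprocessing
import os


def square(x):
    return x * x


def compute_in_forked_child(values):
    """Fork, build the pool inside the child, return its results via a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: stands in for the daemonized process.
        os.close(read_fd)
        status = 1
        try:
            ctx = multiprocessing.get_context("fork")
            pool = ctx.Pool(processes=2)  # created after the fork
            try:
                results = pool.map(square, values)
            finally:
                pool.close()
                pool.join()
            with os.fdopen(write_fd, "w") as out:
                out.write(json.dumps(results))
            status = 0
        finally:
            os._exit(status)  # leave without running the parent's cleanup
    # Parent: read the child's results, then reap it so no zombie remains.
    os.close(write_fd)
    with os.fdopen(read_fd) as src:
        payload = src.read()
    os.waitpid(pid, 0)
    return json.loads(payload)
```

The same reasoning would account for the NoneType errors in stop: stop runs in a separate invocation of the script, which never executes daemonize and therefore never creates a pool.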