How to Clean Up subprocess.Popen Instances Upon Pr

Posted 2019-08-07 08:13

Question:

I have a JavaScript application running on a Python / PyQt / QtWebKit foundation which creates subprocess.Popen objects to run external processes.

Popen objects are kept in a dictionary and referenced by an internal identifier, so that the JS app can call Popen's methods through a pyqtSlot: for example, poll() to determine whether the process is still running, or kill() to kill a rogue process.

If a process is not running any more, I would like to remove its Popen object from the dictionary for garbage collection.

What would be the recommended approach to cleaning up the dictionary automatically to prevent a memory leak?

My ideas so far:

  • Call Popen.wait() in a thread per spawned process to perform an automatic cleanup right upon termination.
    PRO: Immediate cleanup, threads probably do not cost much CPU power as they should be sleeping, right?
    CON: Many threads depending on spawning activity.
  • Use a thread to call Popen.poll() on all existing processes and check returncode if they have terminated and clean up in that case.
    PRO: Just one worker thread for all processes, lower memory usage.
    CON: Periodic polling necessary, higher CPU usage if there are many long-running processes or lots of processes spawned.

Which one would you choose, and why? Or are there any better solutions?

Answer 1:

For a platform-agnostic solution, I'd go with option #2, since the "CON" of high CPU usage can be circumvented with something like...

import time

# Assuming the Popen objects are in the dictionary values
PROCESS_DICT = { ... }

def my_thread_main():
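    while True:
        dead_keys = []
        # Iterate over a snapshot so other threads can add entries meanwhile
        for k, v in list(PROCESS_DICT.items()):
            # poll() returns the exit code, or None while still running
            if v.poll() is not None:
                dead_keys.append(k)
        if not dead_keys:
            time.sleep(1)  # Adjust sleep time to taste
            continue
        for k in dead_keys:
            # pop() tolerates a key that was already removed elsewhere
            PROCESS_DICT.pop(k, None)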

...whereby, if no processes died on an iteration, you just sleep for a bit.

So, in effect, your thread would still be sleeping most of the time, and although there's potential latency between a child process dying and its subsequent 'cleanup', it's really not a big deal, and this should scale better than using one thread per process.
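For comparison, option #1 from the question (one waiting thread per process) could look roughly like the sketch below. The names `PROCESSES`, `PROCESSES_LOCK` and `spawn` are made up for illustration, and the lock is an assumption about how the dictionary would be shared with the pyqtSlot code.

```python
import subprocess
import sys
import threading

# Hypothetical registry: internal identifier -> Popen object
PROCESSES = {}
PROCESSES_LOCK = threading.Lock()

def spawn(key, args):
    """Start a child and a daemon thread that removes it once it exits."""
    proc = subprocess.Popen(args)
    with PROCESSES_LOCK:
        PROCESSES[key] = proc

    def reap():
        proc.wait()  # Blocks until the child exits
        with PROCESSES_LOCK:
            PROCESSES.pop(key, None)

    watcher = threading.Thread(target=reap, daemon=True)
    watcher.start()
    return proc, watcher

if __name__ == "__main__":
    # Run a trivial child and wait for its watcher thread to clean up
    proc, watcher = spawn("job-1", [sys.executable, "-c", "pass"])
    watcher.join()
    print(PROCESSES)
```

Each watcher spends its life blocked in wait(), so the cost is mostly one thread stack per child rather than CPU time.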

There are better platform-dependent solutions, however.

For Windows, you should be able to use the WaitForMultipleObjects function via ctypes as ctypes.windll.kernel32.WaitForMultipleObjects, although you'd have to look into the feasibility.
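As a starting point for that feasibility check, here is an untested sketch of the ctypes call. It relies on `Popen._handle`, which is a private CPython attribute on Windows and may change; `wait_for_any_exit` and `timeout_ms` are names invented for this example. Note that the Win32 API accepts at most 64 handles per call.

```python
import ctypes
import sys

WAIT_OBJECT_0 = 0x00000000
WAIT_TIMEOUT = 0x00000102
WAIT_FAILED = 0xFFFFFFFF
MAXIMUM_WAIT_OBJECTS = 64  # Win32 limit on handles per call

def wait_for_any_exit(procs, timeout_ms=1000):
    """Return the first Popen in procs that has exited, or None on timeout."""
    if sys.platform != "win32":
        raise RuntimeError("WaitForMultipleObjects is only available on Windows")
    if not 0 < len(procs) <= MAXIMUM_WAIT_OBJECTS:
        raise ValueError("need between 1 and 64 processes per call")

    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.WaitForMultipleObjects.argtypes = [
        wintypes.DWORD, ctypes.POINTER(wintypes.HANDLE),
        wintypes.BOOL, wintypes.DWORD,
    ]
    kernel32.WaitForMultipleObjects.restype = wintypes.DWORD

    # Assumption: Popen._handle is an implementation detail, not public API
    handles = (wintypes.HANDLE * len(procs))(*[int(p._handle) for p in procs])
    # bWaitAll=False: return as soon as any one process handle is signaled
    result = kernel32.WaitForMultipleObjects(len(procs), handles, False, timeout_ms)

    if result == WAIT_TIMEOUT:
        return None
    if result == WAIT_FAILED:
        raise ctypes.WinError(ctypes.get_last_error())
    return procs[result - WAIT_OBJECT_0]
```

A cleanup thread could call this in a loop and remove the returned Popen object from the dictionary each time.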

For OSX and Linux, it's probably easiest to handle SIGCHLD asynchronously, using the signal module.

A quick n' dirty example...

import os
import time
import signal
import subprocess

# Map child PID to Popen object
SUBPROCESSES = {}

# Define handler
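def handle_sigchld(signum, frame):
    # Several exits can be reported by a single SIGCHLD, so reap in a loop
    while True:
        try:
            pid = os.waitpid(-1, os.WNOHANG)[0]
        except ChildProcessError:
            return  # No child processes left
        if pid == 0:
            return  # Remaining children are still running
        print('Subprocess PID=%d ended' % pid)
        SUBPROCESSES.pop(pid, None)

# Handle SIGCHLD
signal.signal(signal.SIGCHLD, handle_sigchld)

# Spawn a couple of subprocesses
p1 = subprocess.Popen(['sleep', '1'])
SUBPROCESSES[p1.pid] = p1
p2 = subprocess.Popen(['sleep', '2'])
SUBPROCESSES[p2.pid] = p2

# Wait for all subprocesses to die
while SUBPROCESSES:
    print('tick')
    time.sleep(1)

# Done
print('All subprocesses died')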
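One caveat with reaping children directly in the handler is that the exit status is collected outside of Popen, so the Popen object may no longer be able to report the real return code afterwards. If the JS app needs that code, a variant that calls poll() on the registered objects instead might be preferable. This is only a sketch for POSIX systems; `reap_finished` is a made-up name.

```python
import signal
import subprocess
import sys
import time

# Map child PID to Popen object, as in the example above
SUBPROCESSES = {}

def reap_finished(signum=None, frame=None):
    """Drop every registered child whose poll() reports an exit code."""
    for pid, proc in list(SUBPROCESSES.items()):
        if proc.poll() is not None:  # poll() also records proc.returncode
            # pop() stays safe if a nested handler call removed it first
            SUBPROCESSES.pop(pid, None)

if __name__ == "__main__":
    # signal.signal() may only be called from the main thread
    signal.signal(signal.SIGCHLD, reap_finished)
    child = subprocess.Popen([sys.executable, "-c", "pass"])
    SUBPROCESSES[child.pid] = child
    reap_finished()  # Covers a child that exited before it was registered
    while SUBPROCESSES:
        time.sleep(0.1)
    print("exit code:", child.returncode)
```

The handler does a little more work per signal because it polls every registered child, but the dictionary and the Popen objects stay consistent with each other.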