I have a JavaScript application running on a Python / PyQt / QtWebKit foundation which creates subprocess.Popen objects to run external processes. The Popen objects are kept in a dictionary and referenced by an internal identifier so that the JS app can call Popen's methods via a pyqtSlot, such as poll() to determine whether the process is still running, or kill() to kill a rogue process.
If a process is no longer running, I would like to remove its Popen object from the dictionary so it can be garbage-collected.
What would be the recommended approach to cleaning up the dictionary automatically to prevent a memory leak?
My ideas so far:
- Call Popen.wait() in a thread per spawned process to perform an automatic cleanup right upon termination (see the sketch after this list).
PRO: Immediate cleanup; the threads probably do not cost much CPU power, as they should be sleeping, right?
CON: Many threads, depending on spawning activity.
- Use a single thread to call Popen.poll() on all existing processes, check returncode to see whether they have terminated, and clean up in that case.
PRO: Just one worker thread for all processes, lower memory usage.
CON: Periodic polling is necessary; higher CPU usage if there are many long-running processes or lots of processes spawned.
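For illustration, option #1 might look roughly like the following minimal sketch (spawn and PROCESS_DICT are placeholder names for this example, not part of the actual application): a daemon thread blocks in wait() and removes the dictionary entry as soon as the process exits.

import subprocess
import threading

PROCESS_DICT = {}  # internal identifier -> Popen object

def spawn(ident, args):
    """Start a process plus a daemon thread that reaps it on exit."""
    proc = subprocess.Popen(args)
    PROCESS_DICT[ident] = proc

    def reap():
        proc.wait()                    # blocks until the process exits
        PROCESS_DICT.pop(ident, None)  # drop it for garbage collection

    threading.Thread(target=reap, daemon=True).start()
    return proc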
Which one would you choose, and why? Or are there any better solutions?
For a platform-agnostic solution, I'd go with option #2, since the "CON" of high CPU usage can be circumvented with something like...
import time

# Assuming the Popen objects are in the dictionary values
PROCESS_DICT = { ... }

def my_thread_main():
    while True:
        # Collect the keys of processes that have terminated
        dead_keys = []
        for k, v in PROCESS_DICT.items():
            v.poll()  # updates v.returncode without blocking
            if v.returncode is not None:
                dead_keys.append(k)
        if not dead_keys:
            time.sleep(1)  # Adjust sleep time to taste
            continue
        # Drop terminated processes so they can be garbage-collected
        for k in dead_keys:
            del PROCESS_DICT[k]
...whereby, if no processes died on an iteration, you just sleep for a bit.
So, in effect, your thread would still be sleeping most of the time. There's some latency between a child process dying and its subsequent cleanup, but it's really not a big deal, and this should scale better than using one thread per process.
There are better platform-dependent solutions, however.
For Windows, you should be able to use the WaitForMultipleObjects function via ctypes, as ctypes.windll.kernel32.WaitForMultipleObjects, although you'd have to look into the feasibility.
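Untested, but a minimal sketch of that idea might look like the following. It assumes CPython 3 on Windows, where a Popen object exposes the underlying process HANDLE via the private _handle attribute (an implementation detail that could change), and it glosses over error handling and the 64-handle limit of WaitForMultipleObjects:

import ctypes

WAIT_OBJECT_0 = 0x000
WAIT_TIMEOUT = 0x102

def wait_for_any(procs, timeout_ms=1000):
    """Block until one of the given Popen objects exits or the timeout
    elapses; return the exited Popen, or None on timeout."""
    handles = (ctypes.c_void_p * len(procs))(
        *[int(p._handle) for p in procs])
    ret = ctypes.windll.kernel32.WaitForMultipleObjects(
        len(procs),  # nCount
        handles,     # lpHandles
        False,       # bWaitAll: wake as soon as *any* handle is signaled
        timeout_ms)  # dwMilliseconds
    if ret == WAIT_TIMEOUT:
        return None
    return procs[ret - WAIT_OBJECT_0]

A worker thread could call this in a loop and delete the returned process from the dictionary, waking up exactly when a child exits instead of polling.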
For OS X and Linux, it's probably easiest to handle the SIGCHLD signal asynchronously, using the signal module.
A quick n' dirty example...
import os
import time
import signal
import subprocess

# Map child PID to Popen object
SUBPROCESSES = {}

# Define handler: reap every child that has exited, since multiple
# SIGCHLDs delivered close together can be coalesced into a single one
def handle_sigchld(signum, frame):
    while True:
        try:
            pid, _ = os.waitpid(-1, os.WNOHANG)
        except OSError:  # no more children
            break
        if pid == 0:     # remaining children are still running
            break
        print('Subprocess PID=%d ended' % pid)
        SUBPROCESSES.pop(pid, None)

# Handle SIGCHLD
signal.signal(signal.SIGCHLD, handle_sigchld)

# Spawn a couple of subprocesses
p1 = subprocess.Popen(['sleep', '1'])
SUBPROCESSES[p1.pid] = p1
p2 = subprocess.Popen(['sleep', '2'])
SUBPROCESSES[p2.pid] = p2

# Wait for all subprocesses to die
while SUBPROCESSES:
    print('tick')
    time.sleep(1)

# Done
print('All subprocesses died')