From within a Python GUI (PyGTK) I start a process (using multiprocessing). The process takes a long time (~20 minutes) to finish. When the process is finished I would like to clean it up (extract the results and join the process). How do I know when the process has finished?
My colleague suggested a busy loop within the parent process that checks if the child process has finished. Surely there is a better way.
On Unix, when a forked child process finishes, the parent receives a signal (SIGCHLD) and can run a handler for it. But I cannot see anything like that in Python. Am I missing something?
How can the parent process observe the end of a child process? (Of course, I do not want to call Process.join(), as it would freeze the GUI.)
This question is not limited to multi-processing: I have exactly the same problem with multi-threading.
This answer is really simple! (It just took me days to work it out.)
Combined with PyGTK's idle_add(), you can create an AutoJoiningThread. The total code is borderline trivial:
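The listing did not survive here, so what follows is only a minimal sketch of the idea, not the original code: when run() finishes, the thread asks the main loop to call join() on it. To keep it runnable without PyGTK (and on Python 3), the scheduler is passed in as a hypothetical `schedule` parameter; in a PyGTK program you would pass `gobject.idle_add`. `PendingCalls` is a made-up stand-in for the main loop, not part of any library.

```python
import queue
import threading


class AutoJoiningThread(threading.Thread):
    """Thread that asks the main loop to join it once its work is done."""

    def __init__(self, schedule, target, args=()):
        # `schedule` is a hypothetical parameter: any callable that runs a
        # function later on the main thread (gobject.idle_add in PyGTK).
        super().__init__()
        self._schedule = schedule
        self._work = target
        self._work_args = args
        self.result = None

    def run(self):
        try:
            self.result = self._work(*self._work_args)
        finally:
            # The scheduled join() returns almost at once, because run()
            # is about to exit when it is queued.
            self._schedule(self.join)


class PendingCalls:
    """Stand-in for a GUI main loop's idle queue (assumption, not PyGTK)."""

    def __init__(self):
        self._calls = queue.Queue()

    def idle_add(self, func, *args):
        self._calls.put((func, args))

    def run_one(self, timeout=5):
        func, args = self._calls.get(timeout=timeout)
        func(*args)


loop = PendingCalls()
worker = AutoJoiningThread(loop.idle_add, target=sum, args=([1, 2, 3],))
worker.start()
loop.run_one()  # the "main loop" executes worker.join()
print(worker.result, worker.is_alive())
```

Because join() returns None, idle_add() treats the callback as one-shot and does not call it again.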
If you want to do more than just join (such as collecting results) then you can extend the above class to emit signals on completion, as is done in the following example:
The output of the above example will depend on the order the threads are executed, but it will be similar to:
It's not possible to create an AutoJoiningProcess in the same way (because we cannot call idle_add() across two different processes); however, we can use an AutoJoiningThread to get what we want:
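Again the original listing is missing, so this is a rough sketch under assumptions rather than the author's class: a helper thread starts the process, blocks on its join(), and only then hands a callback to the main loop. The `schedule` and `on_done` parameters and the `slow_job` function are hypothetical names; a plain `queue.Queue` plays the role of the GUI idle queue so the example runs without PyGTK.

```python
import multiprocessing
import queue
import threading
import time


class AutoJoiningProcess(threading.Thread):
    """Helper thread that runs a process and reports back when it ends."""

    def __init__(self, schedule, process, on_done=None):
        # `schedule` would be gobject.idle_add in a PyGTK program.
        super().__init__()
        self._schedule = schedule
        self.process = process
        self._on_done = on_done

    def run(self):
        self.process.start()
        self.process.join()  # blocks this helper thread, not the GUI
        self._schedule(self._finish)

    def _finish(self):
        # Meant to run on the main thread: join the helper thread itself,
        # then report the child's exit code.
        self.join()
        if self._on_done is not None:
            self._on_done(self.process.exitcode)


def slow_job():
    """Hypothetical stand-in for the long-running task."""
    time.sleep(0.2)


if __name__ == "__main__":
    pending = queue.Queue()  # stand-in for the GUI idle queue
    proc = multiprocessing.Process(target=slow_job)
    watcher = AutoJoiningProcess(
        pending.put, proc, on_done=lambda code: print("exit code:", code)
    )
    watcher.start()
    pending.get(timeout=30)()  # what the main loop would do when idle
```

The process is started and joined from the same helper thread, so the GUI thread never blocks for longer than the final, near-instant join of the helper.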
To demonstrate AutoJoiningProcess here is another example:
The resulting output will be very similar to the example above, except this time we have both the process joining and its attendant thread joining too:
Unfortunately:
Thus, to use this approach, it would be best to only create threads/processes from within the mainloop/GUI.
You can use a queue to communicate with child processes. You can stick intermediate results on it, or messages indicating that milestones have been hit (for progress bars), or just a message indicating that the process is ready to be joined. Polling it with empty() is easy and fast, though the multiprocessing docs note that the result is not fully reliable across processes.
If you really only want to know if it's done, you can watch the exitcode of your process or poll is_alive().
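Both suggestions can be combined into one non-blocking polling step. This is only a sketch: `drain`, `poll_child`, `on_message` and `on_done` are names made up for illustration. It uses get_nowait() rather than empty(), and it returns True while it wants to be called again, which matches the convention of PyGTK's timeout callbacks.

```python
import queue


def drain(q):
    """Return every message currently waiting on q, without blocking."""
    messages = []
    while True:
        try:
            messages.append(q.get_nowait())
        except queue.Empty:  # multiprocessing.Queue raises this too
            return messages


def poll_child(proc, q, on_message, on_done):
    """One polling step; returns True while it should be called again."""
    for msg in drain(q):
        on_message(msg)
    if proc.is_alive():
        return True
    proc.join()  # the child has already finished, so this returns at once
    for msg in drain(q):  # pick up anything sent just before exit
        on_message(msg)
    on_done(proc.exitcode)
    return False
```

In a PyGTK program, something like `gobject.timeout_add(200, poll_child, proc, q, on_message, on_done)` would run this step every 200 ms until it returns False, with `proc` a started multiprocessing.Process and `q` a multiprocessing.Queue shared with it.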
In my efforts to try to find an answer to my own question, I stumbled across PyGTK's idle_add() function. This gives me the following possibility:
This seems an overly complex way to re-create Unix's call-callback-when-child-process-is-done.
This must be an uber-common problem with GUIs in Python. Surely there is a standard pattern to solve this problem?
I think that, as part of making Python multi-platform, simple things like SIGCHLD must be done yourself. Agreed, this is a little more work when all you want to do is know when the child is done, but it really isn't THAT painful. Consider the following, which uses a child process to do the work, two multiprocessing.Event instances, and a thread to check if the child process is done:
EDIT
Joining all the processes and threads you create is good practice because it helps reveal when zombie (never-finishing) processes or threads are being created. I've altered the above code, making a ChildChecker class that inherits from threading.Thread. Its sole purpose is to start a job in a separate process, wait for that process to finish, and then notify the GUI when everything is complete. Joining on the ChildChecker will also join the process it is "checking". Now, if the process doesn't join after 5 seconds, the thread will forcibly terminate the process. Entering "y" starts a child process running "endlessChildsPlay", which demonstrates forced termination.
Have a look at the subprocess module:
http://docs.python.org/library/subprocess.html
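For what it's worth, here is a small sketch of how that might look if the work can run as a separate program. Popen.poll() returns None while the child is still running and its exit code afterwards, so it never blocks. `poll_subprocess` and `on_done` are hypothetical names, and the child command is just a placeholder that sleeps briefly.

```python
import subprocess
import sys
import time


def poll_subprocess(child, on_done):
    """One polling step; returns True while the child is still running."""
    code = child.poll()  # None while running, exit code once finished
    if code is None:
        return True
    on_done(code)
    return False


# Placeholder child: a Python one-liner that sleeps for a moment.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(0.2)"])
exit_codes = []
while poll_subprocess(child, exit_codes.append):
    time.sleep(0.05)  # a GUI would return to its main loop here instead
print("child exited with", exit_codes[0])
```

In a GUI the while loop would be replaced by a timer callback that calls poll_subprocess() and keeps itself scheduled as long as it returns True.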