Python Subprocess Popen Stalling CGI Page

Posted 2019-02-24 13:13

Question:

I am working on a tool that needs to run a parser and also produce a separate analysis log. At the moment it is driven through a web interface:

  1. The user goes to the form and submits a filename for parsing (the file is already on the system).
  2. The form submits the information to a Python CGI script.
  3. The Python CGI script runs and spawns a subprocess to do the parsing.
  4. The parser finds the information needed for analysis and spawns a further subprocess.

In my code I am using:

import subprocess
...
subprocess.Popen(["./program.py", input])

I assumed from the documentation that Popen does not wait for the child process to terminate and that the script simply keeps running. The CGI script that starts all this does:

subprocess.Popen(["./program.py", input])
# HTML generation code
# JavaScript to redirect to a different page after 1 second

The HTML generation code only outputs a status message saying the request has been processed, and the JavaScript then redirects the browser to the main homepage.

The Problem

The CGI page hangs until the subprocesses finish, which is not what I want. I thought Popen does not wait for the subprocesses to finish, but whenever I run this tool the page stalls until they are complete. I want the CGI script to finish and leave the subprocesses running in the background, so that the web pages keep working and the user does not see a loading indicator and assume everything has stalled.

I can't find any reason why Popen would do this: everything I read says it does not wait, but it seems to.

Something else odd is that the Apache logs show "Request body read timeout" before the script completes. Is Apache actually stalling the script?

Sorry I can't show complete code as it's "confidential" but hopefully the logic is there to be understood.

Answer 1:

Apache probably waits for the child process to complete. You could try to daemonize the child (double fork, setsid) or, better, just submit the job to a local service, e.g. by writing to a predefined file, by using a message broker, or via a higher-level interface such as Celery.
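
As a rough illustration of the "detach the child" idea, here is a minimal sketch that uses Popen's start_new_session option (Python 3.2+, POSIX only) in place of a hand-written double fork. The helper name spawn_detached and the sleeping child command are made up for the example, and whether this alone is enough under a particular Apache setup is an assumption, not something confirmed in this thread.

```python
import subprocess
import sys


def spawn_detached(argv):
    """Start argv in its own session, with no ties to the caller's stdio.

    Hypothetical helper for illustration only.
    """
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,   # do not share the request body stream
        stdout=subprocess.DEVNULL,  # do not hold the pipe the web server reads
        stderr=subprocess.DEVNULL,  # do not hold the server's error-log pipe
        start_new_session=True,     # POSIX only: the child calls setsid()
    )


if __name__ == "__main__":
    # Stand-in for the real parser: a child that just sleeps for a second.
    child = spawn_detached([sys.executable, "-c", "import time; time.sleep(1)"])
    print("started background job with pid", child.pid)
```

Because all three standard streams are redirected, any output from the child is discarded; if the parser's output matters, open a log file and pass it as stdout/stderr instead.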



Answer 2:

I'm not sure exactly why this works, but I followed the answer in this thread: How do I run another script in Python without waiting for it to finish?

Do this:

p = subprocess.Popen([sys.executable, '/path/to/script.py'], 
                     stdout=subprocess.PIPE, 
                     stderr=subprocess.STDOUT)

Instead of:

p = subprocess.Popen([sys.executable, '/path/to/script.py'])

For some reason the CGI script now terminates while the subprocesses keep running. Any insight into why there is a difference would be helpful. I don't see why leaving out those two parameters would cause such a stall.
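
One likely explanation: a child started without stdout/stderr arguments inherits the CGI script's own stdout, which is the pipe the web server reads the response from. The server sees end-of-file only when every process holding that pipe has closed it, so a long-running child keeps the request open. The sketch below reproduces that effect without Apache, on the assumption of a POSIX system: the outer process plays the server, CGI_TEMPLATE is a made-up stand-in for the CGI script, and the child merely sleeps. It redirects to subprocess.DEVNULL rather than subprocess.PIPE, since an unread PIPE can fill up or break once the parent has exited.

```python
import subprocess
import sys
import time

# Stand-in for the long-running parser: just sleeps for one second.
CHILD = "import time; time.sleep(1)"

# Stand-in for the CGI script: spawns the child, prints a page, exits at once.
CGI_TEMPLATE = """
import subprocess, sys
subprocess.Popen([sys.executable, "-c", {child!r}]{extra})
print("request accepted")
"""


def time_request(extra_popen_args):
    """Play the web server: run the fake CGI script and read its output to EOF."""
    script = CGI_TEMPLATE.format(child=CHILD, extra=extra_popen_args)
    start = time.monotonic()
    server_side = subprocess.Popen([sys.executable, "-c", script],
                                   stdout=subprocess.PIPE)
    # read() returns only once every process holding the pipe has closed it.
    server_side.stdout.read()
    server_side.wait()
    return time.monotonic() - start


inherited = time_request("")
redirected = time_request(
    ", stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL")
print(f"child inherits stdout:   {inherited:.1f}s")
print(f"child output redirected: {redirected:.1f}s")
```

On a POSIX system the first figure should be at least the child's sleep time, while the second should be close to the start-up time of the fake CGI script alone.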