I have a Python program (precisely, a Django application) that starts a subprocess using subprocess.Popen. Due to architecture constraints of my application, I'm not able to use Popen.terminate() to terminate the subprocess and Popen.poll() to check when the process has terminated, because I cannot hold a reference to the started subprocess in a variable.

Instead, I have to write the process id pid to a file pidfile when the subprocess starts. When I want to stop the subprocess, I open this pidfile and use os.kill(pid, signal.SIGTERM) to stop it.
My question is: how can I find out when the subprocess has really terminated? With signal.SIGTERM it takes approximately 1-2 minutes to finally terminate after the call to os.kill(). At first I thought that os.waitpid() would be the right tool for this task, but when I call it after os.kill() it gives me OSError: [Errno 10] No child processes.
By the way, I'm starting and stopping the subprocess from an HTML template using two forms, and the program logic lives in a Django view. The exception gets displayed in my browser when my application is in debug mode. It's probably also important to know that the subprocess I call in my view (python manage.py crawlwebpages) itself starts another subprocess, namely an instance of a Scrapy crawler. I write the pid of this Scrapy instance to the pidfile, and this is the process I want to terminate.
Here is the relevant code:
def process_main_page_forms(request):
    if request.method == 'POST':
        if request.POST['form-type'] == u'webpage-crawler-form':
            template_context = _crawl_webpage(request)
        elif request.POST['form-type'] == u'stop-crawler-form':
            template_context = _stop_crawler(request)
    else:
        template_context = {
            'webpage_crawler_form': WebPageCrawlerForm(),
            'stop_crawler_form': StopCrawlerForm()}

    return render(request, 'main.html', template_context)
def _crawl_webpage(request):
    webpage_crawler_form = WebPageCrawlerForm(request.POST)

    if webpage_crawler_form.is_valid():
        url_to_crawl = webpage_crawler_form.cleaned_data['url_to_crawl']
        maximum_pages_to_crawl = webpage_crawler_form.cleaned_data['maximum_pages_to_crawl']

        program = 'python manage.py crawlwebpages' + ' -n ' + str(maximum_pages_to_crawl) + ' ' + url_to_crawl
        p = subprocess.Popen(program.split())

    template_context = {
        'webpage_crawler_form': webpage_crawler_form,
        'stop_crawler_form': StopCrawlerForm()}

    return template_context
def _stop_crawler(request):
    stop_crawler_form = StopCrawlerForm(request.POST)

    if stop_crawler_form.is_valid():
        with open('scrapy_crawler_process.pid', 'rb') as pidfile:
            process_id = int(pidfile.read().strip())
            print 'PROCESS ID:', process_id

        os.kill(process_id, signal.SIGTERM)
        os.waitpid(process_id, os.WNOHANG)  # This gives me the OSError
        print 'Crawler process terminated!'

    template_context = {
        'webpage_crawler_form': WebPageCrawlerForm(),
        'stop_crawler_form': stop_crawler_form}

    return template_context
What can I do? Thank you very much!
EDIT:

Following the great answer given by Jacek Konieczny, I was able to solve my problem by changing the code of my _stop_crawler(request) function to the following:
def _stop_crawler(request):
    stop_crawler_form = StopCrawlerForm(request.POST)

    if stop_crawler_form.is_valid():
        with open('scrapy_crawler_process.pid', 'rb') as pidfile:
            process_id = int(pidfile.read().strip())

        # These are the essential lines
        os.kill(process_id, signal.SIGTERM)
        while True:
            try:
                time.sleep(10)
                os.kill(process_id, 0)
            except OSError:
                break
        print 'Crawler process terminated!'

    template_context = {
        'webpage_crawler_form': WebPageCrawlerForm(),
        'stop_crawler_form': stop_crawler_form}

    return template_context