Using Popen in a thread blocks every incoming Flas

2019-02-08 08:04发布

问题:

I have the following situation: I receive a request on a socketio server. I answer it (socket.emit(..)) and then start something with heavy computation load in another thread.

If the heavy computation is caused by subprocess.Popen (using subprocess.PIPE) it totally blocks every incoming request as long as it is being executed although it happens in a separate thread.

No problem - in this thread it was suggested to asynchronously read the result of the subprocess with a buffer size of 1 so that between these reads other threads have the chance to do something. Unfortunately this did not help for me.

I also already monkeypatched eventlet and that works fine - as long as I don't use subprocess.Popen with subprocess.PIPE in the thread.

In this code sample you can see that it only happens using subprocess.Popen with subprocess.PIPE. When uncommenting #functionWithSimulatedHeavyLoad() and instead comment functionWithHeavyLoad() everything works like charm.

from flask import Flask
from flask.ext.socketio import SocketIO, emit
import eventlet

eventlet.monkey_patch()
app = Flask(__name__)
socketio = SocketIO(app)

import time
from threading  import Thread

@socketio.on('client command')
def response(data, type = None, nonce = None):
    socketio.emit('client response', ['foo'])
    thread = Thread(target = testThreadFunction)
    thread.daemon = True
    thread.start()

def testThreadFunction():
    #functionWithSimulatedHeavyLoad()
    functionWithHeavyLoad()

def functionWithSimulatedHeavyLoad():
    time.sleep(5)

def functionWithHeavyLoad():
    from datetime import datetime
    import subprocess
    import sys
    from queue import Queue, Empty

    ON_POSIX = 'posix' in sys.builtin_module_names

    def enqueueOutput(out, queue):
        for line in iter(out.readline, b''):
            if line == '':
                break
            queue.put(line)
        out.close()

    # just anything that takes long to be computed
    shellCommand = 'find / test'

    p = subprocess.Popen(shellCommand, universal_newlines=True, shell=True, stdout=subprocess.PIPE, bufsize=1, close_fds=ON_POSIX)
    q = Queue()
    t = Thread(target = enqueueOutput, args = (p.stdout, q))
    t.daemon = True
    t.start()
    t.join()

    text = ''

    while True:
        try:
            line = q.get_nowait()
            text += line
            print(line)
        except Empty:
            break

    socketio.emit('client response', {'text': text})

socketio.run(app)

The client receives the message 'foo' after the blocking work in the functionWithHeavyLoad() function is completed. It should receive the message earlier, though.

This sample can be copied and pasted in a .py file and the behavior can be instantly reproduced.

I am using Python 3.4.3, Flask 0.10.1, flask-socketio1.2, eventlet 0.17.4

Update

If I put this into the functionWithHeavyLoad function it actually works and everything's fine:

import shlex
shellCommand = shlex.split('find / test')

popen = subprocess.Popen(shellCommand, stdout=subprocess.PIPE)

lines_iterator = iter(popen.stdout.readline, b"")
for line in lines_iterator:
    print(line)
    eventlet.sleep()

The problem is: I used find for heavy load in order to make the sample for you more easily reproducable. However, in my code I actually use tesseract "{0}" stdout -l deu as the sell command. This (unlike find) still blocks everything. Is this rather a tesseract issue than eventlet? But still: how can this block if it happens in a separate thread where it reads line by line with context switch when find does not block?

回答1:

Thanks to this question I learned something new today. Eventlet does offer a greenlet friendly version of subprocess and its functions, but for some odd reason it does not monkey patch this module in the standard library.

Link to the eventlet implementation of subprocess: https://github.com/eventlet/eventlet/blob/master/eventlet/green/subprocess.py

Looking at the eventlet patcher, the modules that are patched are os, select, socket, thread, time, MySQLdb, builtins and psycopg2. There is absolutely no reference to subprocess in the patcher.

The good news is that I was able to work with Popen() in an application very similar to yours, after I replaced:

import subprocess

with:

from eventlet.green import subprocess

But note that the currently released version of eventlet (0.17.4) does not support the universal_newlines option in Popen, you will get an error if you use it. Support for this option is in master (here is the commit that added the option). You will either have to remove that option from your call, or else install the master branch of eventlet direct from github.