Flask long routines

Posted 2020-02-09 13:11

Question:

I have to do some long-running work in my Flask app, and I want to do it asynchronously: just start the work, and then check its status from JavaScript.

I'm trying to do something like:

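from multiprocessing import Process

# app (the Flask application) and routine (the long job) are defined elsewhere
@app.route('/sync')
def sync():
    # Start the long job in a separate process and return immediately
    p = Process(target=routine, args=('abc',))
    p.start()

    return "Working..."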

But this creates defunct gunicorn workers.

How can it be solved? Should I use something like Celery?

Answer 1:

There are many options. You can develop your own solution, or use Celery or Twisted (I'm sure there are more ready-made options out there, but those are the most common ones).

Developing your in-house solution isn't difficult. You can use the multiprocessing module of the Python standard library:

  • When a task arrives, you insert a row in your database with the task id and status.
  • Then you launch a process to perform the work, which updates the row's status when it finishes.
  • You can have a view to check whether the task is finished, which actually just checks the status in the corresponding row.

Of course, you have to think about where you want to store the result of the computation and what happens with errors.
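
The three steps above can be sketched roughly as follows. This is only a minimal illustration under some assumptions: it uses SQLite as the database, and the names DB_PATH, run_sql, start_task and get_status, the table layout, and the placeholder work inside routine are all hypothetical. Calling multiprocessing.active_children() before launching a new process has the documented side effect of joining children that have already finished, which is one way to avoid leaving them around as defunct processes.

```python
import multiprocessing
import os
import sqlite3
import tempfile
import uuid
from contextlib import closing

# Hypothetical database location; a real app would use its own database.
DB_PATH = os.path.join(tempfile.gettempdir(), "tasks_sketch.db")


def run_sql(sql, params=()):
    """Open a connection, run one statement, commit, close; return first row."""
    with closing(sqlite3.connect(DB_PATH)) as conn, conn:
        return conn.execute(sql, params).fetchone()


def init_db():
    run_sql(
        "CREATE TABLE IF NOT EXISTS tasks ("
        "id TEXT PRIMARY KEY, status TEXT, result TEXT)"
    )


def routine(task_id, data):
    """Runs in the child process and records success or failure in the row."""
    try:
        result = data.upper()  # placeholder for the real long-running work
        run_sql(
            "UPDATE tasks SET status = ?, result = ? WHERE id = ?",
            ("finished", result, task_id),
        )
    except Exception as exc:
        run_sql(
            "UPDATE tasks SET status = ?, result = ? WHERE id = ?",
            ("failed", str(exc), task_id),
        )


def start_task(data):
    """Insert the task row, then launch a process to perform the work."""
    # Side effect: joins any child processes that have already finished.
    multiprocessing.active_children()
    task_id = uuid.uuid4().hex
    run_sql("INSERT INTO tasks (id, status) VALUES (?, ?)", (task_id, "running"))
    process = multiprocessing.Process(target=routine, args=(task_id, data))
    process.start()
    return task_id, process


def get_status(task_id):
    """What the status view would call: returns (status, result) or None."""
    return run_sql("SELECT status, result FROM tasks WHERE id = ?", (task_id,))
```

A view would call start_task when the request arrives and return the task id to the client; a second view would call get_status with that id. Storing the error text in the result column is just one possible answer to the question of what happens with errors.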

Going with Celery is also easy. It would look like the following. To define a function to be executed asynchronously:

@celery.task
def mytask(data):
    # ... do a lot of work ...
    ...

Then instead of calling the task directly, like mytask(data), which would execute it straight away, use the delay method:

result = mytask.delay(mydata)

Finally, you can check if the result is available or not with ready:

result.ready()

However, remember that to use Celery you have to run an external worker process.

I have never looked at Twisted, so I cannot tell you whether it is more or less complex than this (but it should be fine for what you want to do, too).

In any case, any of those solutions should work fine with Flask. For checking the result, it doesn't matter at all whether you use JavaScript. Just make the view that checks the status return JSON (you can use Flask's jsonify).
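
A status view along those lines might look like the sketch below. The route /status/<task_id> and the in-memory TASKS dictionary are assumptions made for illustration; in practice the lookup would go to your database row or to the Celery result instead.

```python
from flask import Flask, jsonify

app = Flask(__name__)

# Hypothetical in-memory status store; it stands in for the database row
# (or the Celery result) that the background work would update.
TASKS = {"42": {"status": "running", "result": None}}


@app.route("/status/<task_id>")
def status(task_id):
    task = TASKS.get(task_id)
    if task is None:
        # Unknown id: report it as JSON with a 404 status code
        return jsonify(error="unknown task"), 404
    return jsonify(task_id=task_id, **task)
```

The JavaScript side would then simply poll this URL until the returned status is no longer "running".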



Answer 2:

I would use a message broker such as RabbitMQ or ActiveMQ. The Flask process would add jobs to the message queue, and a long-running worker process (or a pool of worker processes) would take jobs off the queue and complete them. The worker process could update a database so that the Flask server knows the current status of each job and can pass this information on to the clients.
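
The shape of that pattern can be sketched with the standard library alone. Note the assumptions: queue.Queue stands in for the real message broker, a thread stands in for the separate worker process, and the statuses dictionary stands in for the database table. The names submit, worker and set_status are hypothetical.

```python
import queue
import threading

jobs = queue.Queue()  # stand-in for the broker's job queue
statuses = {}         # stand-in for the status table in the database
statuses_lock = threading.Lock()


def set_status(job_id, status):
    with statuses_lock:
        statuses[job_id] = status


def submit(job_id, payload):
    """What the Flask view would do: record the job and enqueue it."""
    set_status(job_id, "queued")
    jobs.put((job_id, payload))


def worker():
    """Long-running loop that takes jobs off the queue and completes them."""
    while True:
        job_id, payload = jobs.get()
        if job_id is None:  # sentinel value: shut the worker down
            jobs.task_done()
            break
        set_status(job_id, "running")
        try:
            sum(payload)  # placeholder for the real long-running work
            set_status(job_id, "finished")
        except Exception:
            set_status(job_id, "failed")
        finally:
            jobs.task_done()
```

With a real broker the worker would run as its own process, possibly on another machine, and the status updates would go to a database that the Flask server can read.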

Using Celery seems to be a nice way to do this.