Concurrent asynchronous processes with Python, Flask and Celery

Posted 2019-01-22 23:11

I am working on a small but computationally-intensive Python app. The computationally-intensive work can be broken into several pieces that can be executed concurrently. I am trying to identify a suitable stack to accomplish this.

Currently I am planning to use a Flask app on Apache2+WSGI with Celery for the task queue.

In the following, will a_long_process(), another_long_process() and yet_another_long_process() execute concurrently if there are 3 or more workers available? Will the Flask app be blocked while the processes are executing?

from the Flask app:

@myapp.route('/foo')
def bar():
    task_1 = a_long_process.delay(x, y)
    task_1_result = task_1.get(timeout=1)
    task_2 = another_long_process.delay(x, y)
    task_2_result = task_2.get(timeout=1)
    task_3 = yet_another_long_process.delay(x, y)
    task_3_result = task_3.get(timeout=1)
    return task_1 + task_2 + task_3

tasks.py:

from celery import Celery
celery = Celery('tasks', broker="amqp://guest@localhost//", backend="amqp://")
@celery.task
def a_long_process(x, y):
    return something
@celery.task
def another_long_process(x, y):
    return something_else
@celery.task
def yet_another_long_process(x, y):
    return a_third_thing

3 Answers
Anthone
Answered 2019-01-22 23:49

According to the documentation for result.get(), it waits until the result is ready before returning, so it is normally a blocking call. However, because you pass timeout=1, get() will raise a TimeoutError if the task takes longer than one second to complete.
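A minimal sketch of that blocking behaviour, using a task from the question's tasks.py with placeholder arguments (celery.exceptions.TimeoutError is the exception get() raises on timeout):

from celery.exceptions import TimeoutError
from tasks import a_long_process

result = a_long_process.delay(1, 2)   # returns an AsyncResult immediately
print(result.ready())                 # non-blocking check; False while the task is still running
try:
    value = result.get(timeout=1)     # blocks for up to one second
except TimeoutError:
    value = None                      # the task took longer than the timeout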

By default, Celery workers start with a concurrency level equal to the number of CPUs available. The concurrency level determines how many tasks a worker can execute at once: with the default prefork pool this is the number of child processes, and with other pools it is the number of threads or greenlets. So, with a concurrency level >= 3, the worker should be able to process your three tasks concurrently (perhaps someone with greater Celery expertise can verify this?).
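For example (a hedged sketch; "tasks" matches the module in the question and the pool size is arbitrary), the concurrency level can be set explicitly when starting a worker:

celery -A tasks worker --loglevel=info --concurrency=3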

Luminary・发光体
Answered 2019-01-22 23:52

You should change your code so the workers can work in parallel:

from celery.exceptions import TimeoutError

@myapp.route('/foo')
def bar():
    # start all three tasks first so idle workers can run them in parallel
    task_1 = a_long_process.delay(x, y)
    task_2 = another_long_process.delay(x, y)
    task_3 = yet_another_long_process.delay(x, y)
    # fetch results
    try:
        task_1_result = task_1.get(timeout=1)
        task_2_result = task_2.get(timeout=1)
        task_3_result = task_3.get(timeout=1)
    except TimeoutError:
        # Handle this or don't specify a timeout.
        raise
    # combine results
    return task_1_result + task_2_result + task_3_result

This code will block until all results are available (or the timeout is reached).

Will the Flask app be blocked while the processes are executing?

This code will only block one worker of your WSGI container. Whether the entire site becomes unresponsive depends on the WSGI container you are using (e.g. Apache + mod_wsgi, uWSGI, gunicorn, etc.). Most WSGI containers spawn multiple workers, so only one worker will be blocked while your code waits for the task results.

For this kind of application I would recommend using gevent, which spawns a separate greenlet for every request and is very lightweight.
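As a rough illustration (a sketch, not a full deployment; the module name app.py is an assumption, and myapp is the Flask object from the question), the app can be served directly by gevent's WSGI server:

from gevent import monkey
monkey.patch_all()  # make blocking I/O cooperative; do this before other imports

from gevent.pywsgi import WSGIServer
from app import myapp  # hypothetical module containing the Flask app

WSGIServer(('0.0.0.0', 8000), myapp).serve_forever()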

查看更多
做自己的国王
Answered 2019-01-23 00:03

Use the group primitive from Celery's canvas:

The group primitive is a signature that takes a list of tasks that should be applied in parallel.

Here is the example provided in the documentation:

from celery import group
from proj.tasks import add

g = group(add.s(2, 2), add.s(4, 4))
res = g()
res.get()

Which outputs [4, 8].
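Applied to the three tasks from the question, a sketch could look like this (x, y and the timeout value are placeholders; group() dispatches all three signatures at once so idle workers can run them in parallel):

from celery import group
from tasks import a_long_process, another_long_process, yet_another_long_process

@myapp.route('/foo')
def bar():
    job = group(
        a_long_process.s(x, y),
        another_long_process.s(x, y),
        yet_another_long_process.s(x, y),
    )
    # .get() on the group result blocks until all three tasks finish
    result_1, result_2, result_3 = job.apply_async().get(timeout=10)
    return result_1 + result_2 + result_3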
