I have a bunch of Django requests which executes some mathematical computations ( written in C and executed via a Cython module ) which may take an indeterminate amount ( on the order of 1 second ) of time to execute. Also the requests don't need to access the database and are all independent of each other and Django.
Right now everything is synchronous ( using Gunicorn with sync
worker types ) but I'd like to make this asynchronous and nonblocking. In short I'd like to do something:
- Receive the AJAX request
- Allocate task to an available worker ( without blocking the main Django web application )
- Worker executes task in some unknown amount of time
- Django returns the result of the computation (a list of strings) as JSON whenever the task completes
I am very new to asynchronous Django, and so my question is what is the best stack for doing this.
Is this sort of process something a task queue is well suited for? Would anyone recommend Tornado + Celery + RabbitMQ, or perhaps something else?
Thanks in advance!
Celery would be perfect for this.
Since what you're doing is relatively simple (read: you don't need complex rules about how tasks should be routed), you could probably get away with using the Redis backend, which means you don't need to setup/configure RabbitMQ (which, in my experience, is more difficult).
I use Redis with the most a dev build of Celery, and here are the relevant bits of my config:
# Use redis as a queue
BROKER_BACKEND = "kombu.transport.pyredis.Transport"
BROKER_HOST = "localhost"
BROKER_PORT = 6379
BROKER_VHOST = "0"
# Store results in redis
CELERY_RESULT_BACKEND = "redis"
REDIS_HOST = "localhost"
REDIS_PORT = 6379
REDIS_DB = "0"
I'm also using django-celery
, which makes the integration with Django happy.
Comment if you need any more specific advice.
Since you are planning to make it async (presumably using something like gevent), you could also consider making a threaded/forked backend web service for the computational work.
The async frontend server could handle all the light work, get data from databases that are suitable for async (redis or mysql with a special driver), etc. When a computation has to be done, the frontend server can post all input data to the backend server and retrieve the result when the backend server is done computing it.
Since the frontend server is async, it will not block while waiting for the results. The advantage of this as opposed to using celery, is that you can return the result to the client as soon as it becomes available.
client browser <> async frontend server <> backend server for computations