Realtime progress tracking of celery tasks

Posted 2019-01-23 09:39

Question:

I have a main Celery task that starts multiple sub-tasks (thousands), each performing multiple actions (the same actions per sub-task).

What I want is to track from the main Celery task, in real time and for each action, how many are done and how many have failed for each sub-task.

In summary:

  • Main task: receive list of objects, and a list of actions to do for each object.
  • For each object, a sub-task is started to perform the actions for the object.
  • The main task is finished when all the sub-tasks are finished.

So I need to know, from the main task, the real-time progress of the sub-tasks.

The app I am developing uses Django/AngularJS, and I need to show the real-time progress asynchronously in the front end.

I am new to Celery, and I am confused about how to implement this.

Any help would be appreciated. Thanks in advance.

Answer 1:

I have done this before. There's too much code to put here, so please allow me to give just the outline; I trust you can take care of the actual implementation and configuration:

Socket.io-based microservice to send real time events to browser

First, Django is synchronous, so it's not easy to do anything in real time with it.

So I resorted to a socket.io process. You could say it's a microservice that only listens to a Redis-backed "channel" and sends notifications to any browser client listening on a given channel.

Celery -> Redis -> Socket.io -> Browser

I made it so each channel is identified by a Celery task ID. So when I fire a Celery task from the browser, I get the task ID, keep it, and start listening to events from socket.io via that channel.

In chronological order it looks like this:

  • Fire off the Celery task, get the ID
  • Keep the ID in your client app, open a socket.io channel to listen for updates
  • The Celery task sends messages to Redis, which triggers socket.io events
  • Socket.io relays the messages to the browser, in real time
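
The relay step in the middle of that chain can be sketched in Python as follows. This is only an illustration under assumptions: the original service was a socket.io process whose code is not shown, `relay_progress` and `emit` are hypothetical names, and the message dicts follow the shape that redis-py's `PubSub.listen()` yields (`type`, `channel`, `data`).

```python
import json
from typing import Callable, Iterable


def relay_progress(messages: Iterable[dict], emit: Callable[[str, dict], None]) -> int:
    """Forward progress messages from a Redis pub/sub stream to `emit`.

    `messages` is any iterable of dicts shaped like the ones redis-py's
    PubSub.listen() yields. `emit(channel, payload)` is a hypothetical
    callback standing in for the socket.io broadcast to the browser.
    Returns the number of messages relayed.
    """
    relayed = 0
    for message in messages:
        # Skip subscribe/unsubscribe confirmations; only real messages carry JSON.
        if message.get("type") not in ("message", "pmessage"):
            continue
        channel = message["channel"]
        data = message["data"]
        # redis-py returns bytes unless decode_responses=True was set.
        if isinstance(channel, bytes):
            channel = channel.decode("utf-8")
        if isinstance(data, bytes):
            data = data.decode("utf-8")
        emit(channel, json.loads(data))
        relayed += 1
    return relayed
```

With a live Redis server, `messages` could come from `redis.StrictRedis().pubsub()` after calling `psubscribe('task:*:progress')` and then iterating over `listen()`. Passing the stream in as a plain iterable keeps the relay logic easy to exercise without a server.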

Reporting the progress

As for actually updating the status of the task, I have the Celery task itself publish a message on Redis from within its code, e.g. {'done': 2, 'total_to_be_done': 10} (representing a task that has gone through 2 of 10 steps, i.e. 20% progress). I prefer to send both numbers for better UI/UX.

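import json
import redis

# Connects to Redis on localhost:6379 by default
redis_pub = redis.StrictRedis()
# Replace <task_id> with the actual Celery task ID
channel = 'task:<task_id>:progress'
redis_pub.publish(channel, json.dumps({'done': 2, 'total_to_be_done': 10}))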

See the redis-py documentation for publishing messages on Redis with Python.
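
For the sub-task scenario in the question, the publish call could be wrapped in a small helper that also reports failures. This is a minimal sketch, not the code I actually used: `progress_channel` and `publish_progress` are hypothetical names, and the `failed` field is an assumption added to match the question. `client` is anything with a `publish(channel, message)` method, such as a `redis.StrictRedis` instance.

```python
import json


def progress_channel(task_id: str) -> str:
    """Build the channel name using the 'task:<task_id>:progress' convention."""
    return "task:{}:progress".format(task_id)


def publish_progress(client, task_id: str, done: int, failed: int, total: int) -> dict:
    """Publish one progress update as JSON and return the payload that was sent.

    `failed` is an assumed extra field; the original message carried only
    'done' and 'total_to_be_done'.
    """
    payload = {"done": done, "failed": failed, "total_to_be_done": total}
    client.publish(progress_channel(task_id), json.dumps(payload))
    return payload
```

Inside a Celery task declared with `bind=True`, the task's own ID is available as `self.request.id`. A sub-task would need the main task's ID passed in as an argument so that all updates land on the channel the browser is listening to.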

AngularJS/Socket.io integration

You can use, or at least get some inspiration from, a library like angular-socket-io.