I am using Celery to process some tasks. I can see how many are active, scheduled, etc., but I can't find any way to see the tasks that have failed. Flower does show me the status, but only if it was running when the task started and failed. Is there a command to get all the tasks that have failed (STATUS: FAILURE)?
I do have the task ID from when each task was created, but there are millions of them, so I can't check them one by one even if there is a way to check by task ID. Still, if there is such a command, please let me know.
Celery doesn't make it easy to find a failed task but Flower (the main Celery management web app) does simplify this. It keeps a record of task IDs even after they are completed, and has an API to let you find only failed tasks.
Flower's rather basic HTTP API includes the /api/tasks endpoint. You can use /api/tasks?state=FAILURE to show only failed tasks, then parse the JSON to extract what you need. The content is similar to what you see in the web UI, and it's easy to prototype with curl and format or filter the output with jq:
curl -s 'http://localhost:5555/api/tasks?state=FAILURE&limit=5' | jq . | less
Flower needs to be installed and running of course.
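If you would rather do this from Python than from the shell, the same query can be sketched with the standard library. This is only a rough outline: the address is the one from the curl example, the helper names are made up for illustration, and it assumes the response is a JSON object keyed by task ID whose entries carry name and exception fields.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

FLOWER_URL = "http://localhost:5555"  # assumed Flower address, as in the curl example


def failed_tasks_url(base_url=FLOWER_URL, limit=None):
    """Build the /api/tasks URL that filters on state=FAILURE."""
    params = {"state": "FAILURE"}
    if limit is not None:
        params["limit"] = limit
    return f"{base_url}/api/tasks?{urlencode(params)}"


def summarize_failures(payload):
    """Reduce the response (assumed: a dict keyed by task ID) to a few fields."""
    return [
        {"id": task_id, "name": info.get("name"), "exception": info.get("exception")}
        for task_id, info in payload.items()
    ]


def fetch_failed_tasks(base_url=FLOWER_URL, limit=None):
    """Query a running Flower instance and return summaries of failed tasks."""
    with urlopen(failed_tasks_url(base_url, limit)) as response:
        return summarize_failures(json.load(response))
```

With Flower running, fetch_failed_tasks(limit=5) should return the same tasks as the curl command, reduced to ID, task name and exception text.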
Since you have millions of completed tasks, you may need to capture failed task info in a data store for efficient access - perhaps Flower will help. Or you could try a custom on-failure handler in Celery, to capture just failed task info - see this answer.
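As a rough illustration of the on-failure idea, here is a handler that writes each failure to a small SQLite table. It is a sketch under assumptions, not a tested production setup: the file name, table layout and function names are hypothetical, and the handler is written to accept the keyword arguments Celery's task_failure signal sends (sender, task_id, exception and others).

```python
import sqlite3

DB_PATH = "failed_tasks.db"  # hypothetical location for the failure log


def record_failure(sender=None, task_id=None, exception=None, db_path=DB_PATH, **extra):
    """Store one row per failed task; extra signal arguments are ignored."""
    # sender is the task that failed; fall back to its repr if it has no name
    task_name = getattr(sender, "name", repr(sender))
    conn = sqlite3.connect(db_path)  # opened per call to avoid sharing across processes
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS failed_tasks ("
                "task_id TEXT PRIMARY KEY, task_name TEXT, error TEXT)"
            )
            conn.execute(
                "INSERT OR REPLACE INTO failed_tasks VALUES (?, ?, ?)",
                (task_id, task_name, repr(exception)),
            )
    finally:
        conn.close()


def failed_task_ids(db_path=DB_PATH):
    """Return the IDs of all recorded failures."""
    conn = sqlite3.connect(db_path)
    try:
        return [row[0] for row in conn.execute("SELECT task_id FROM failed_tasks")]
    finally:
        conn.close()


# Wiring it up in the worker (needs Celery, so shown only as a comment):
#   from celery.signals import task_failure
#   task_failure.connect(record_failure)
```

Opening a connection on every failure is slow but keeps the sketch simple; with millions of tasks you would probably want a proper database instead of a local file.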
The result object you get back when you queue a task has state and status properties, so you can check the status of a task by its ID.
result = my_task.delay(foo)  # delay() returns an AsyncResult, not a bare ID
result.state
result.status  # alias of state
This gives the task's status: PENDING, STARTED, RETRY, FAILURE or SUCCESS.
AFAIK, Celery shows only active, scheduled, reserved and revoked tasks; it doesn't show failed tasks.
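Since you have all the task IDs, you can just loop over them and check each one's status.
from celery.result import AsyncResult

for task_id in task_id_list:
    # Look up the stored state for each ID; this needs a result backend
    if AsyncResult(task_id).state == 'FAILURE':
        print(task_id)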