I've done a ton of research on this, and I'm surprised I haven't found a good answer anywhere yet.
I'm running a large application on Heroku, and I have certain Celery tasks that run for a very long time and save a result at the end. Every time I redeploy on Heroku, it sends SIGTERM (and, eventually, SIGKILL) and kills my running worker. I'm trying to find a way for the worker to shut itself down gracefully and re-queue the task for processing later, so that we eventually save the required result instead of losing the queued task.
I cannot find a way to get the worker to listen for SIGTERM properly. The closest I've gotten, which works when running python manage.py celeryd directly but NOT when emulating Heroku with foreman, is the following:
import time

from celery import exceptions

# `app` (the Celery application) and `logger` are defined elsewhere in the project.

@app.task(bind=True, max_retries=1)
def slow(self, x):
    try:
        for x in range(100):
            print 'x: ' + unicode(x)
            time.sleep(10)
    except exceptions.MaxRetriesExceededError:
        logger.error('whoa')
    except (exceptions.WorkerShutdown, exceptions.WorkerTerminate) as exc:
        logger.error(u'retrying, ' + unicode(exc))
        raise self.retry(exc=exc, countdown=10)
    except (KeyboardInterrupt, SystemExit) as exc:
        print 'retrying'
        raise self.retry(exc=exc, countdown=10)
    else:
        return x
    finally:
        logger.info('task ended!')
When I start this celery task under foreman and hit Ctrl+C, the following happens:
^CSIGINT received
22:20:59 system | sending SIGTERM to all processes
22:20:59 web.1 | exited with code 0
22:21:04 system | sending SIGKILL to all processes
Killed: 9
So it's clear that neither the celery exceptions nor the KeyboardInterrupt or SystemExit exceptions I've seen suggested in other posts actually catch SIGTERM and shut the worker down gracefully.
What is the right way to do this?
Celery was unfortunately not designed to do a clean shutdown. EVER. I mean it. Celery workers respond to SIGTERM, but if a task is incomplete, the worker processes will wait for it to finish and only then exit. You can send SIGKILL if the workers don't shut down in a reasonable time, but then you lose information, i.e. you may not know which jobs were left incomplete.
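If the worry is not knowing which jobs were in flight, one thing you can do before sending SIGKILL is ask the workers what they are currently executing via the remote-control inspection API. A minimal sketch (it assumes your Celery application instance is importable as app):

import logging

logger = logging.getLogger(__name__)

def log_active_tasks(app):
    # Ask every worker for its currently executing tasks.
    # inspect().active() returns {worker_name: [task_info, ...]}, or None if no worker replies.
    replies = app.control.inspect().active() or {}
    for worker, tasks in replies.items():
        for task in tasks:
            logger.warning('still running on %s: %s[%s]', worker, task['name'], task['id'])

Anything logged this way is a candidate for re-queueing by hand after the deploy.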
You can use acks_late (per task) or the task_acks_late setting (globally). Tasks are then acknowledged to the broker after they have executed, not just before they start, so a task that was killed mid-run is redelivered and will run again once a worker is available.
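For concreteness, here is a minimal sketch of both options (it assumes Celery 4+ setting names and a made-up broker URL; older versions spell the global setting CELERY_ACKS_LATE):

import time
from celery import Celery

app = Celery('tasks', broker='amqp://localhost')  # hypothetical broker URL

# Global: acknowledge messages only after the task has run.
app.conf.task_acks_late = True

# Or per task. Because the task may be delivered again after a crash,
# it should be safe to run more than once (idempotent).
@app.task(acks_late=True)
def slow(x):
    time.sleep(600)  # stand-in for the long-running work
    return x

With late acks, a message that was being processed when the worker died is never acknowledged, so the broker puts it back on the queue and another worker picks it up.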