Start SQS celery worker on Elastic Beanstalk

Posted 2019-08-21 15:31

Question:

I am trying to start a celery worker on EB but get an error which doesn't explain much.

Command in config file in .ebextensions dir:

03_celery_worker:
  command: "celery worker --app=config --loglevel=info -E --workdir=/opt/python/current/app/my_project/"

The listed command works fine on my local machine (with only the workdir parameter changed).

Errors from EB:

Activity execution failed, because: /opt/python/run/venv/local/lib/python3.6/site-packages/celery/platforms.py:796: RuntimeWarning: You're running the worker with superuser privileges: this is absolutely not recommended!

and

Starting new HTTPS connection (1): eu-west-1.queue.amazonaws.com (ElasticBeanstalk::ExternalInvocationError)

I added the parameter --uid=2 to the celery worker command and the privileges error disappeared, but the command still fails with

ExternalInvocationError

Any suggestions as to what I am doing wrong?

Answer 1:

ExternalInvocationError

As I understand it, this means that the listed command cannot be run from EB container commands. You need to create a script on the server and run celery from that script. This post describes how to do it.

Update: You need to create a config file in the .ebextensions directory; I called mine celery.config. The post linked above provides a script that works almost as-is, but it needs a few minor additions to work fully. I had issues with scheduling periodic tasks (celery beat). The steps to fix them are:

  1. Install django-celery-beat (pip install django-celery-beat, and add it to requirements), add django_celery_beat to INSTALLED_APPS, and pass the --scheduler parameter when starting celery beat. Instructions are here.

  2. The script specifies the user that each program runs as. For the celery worker it is the celery user, which the script creates earlier if it doesn't exist. When I tried to start celery beat I got a PermissionDenied error, which means the celery user doesn't have all the necessary rights. I logged in to the EB instance over ssh, looked at the list of all users (cat /etc/passwd), and decided to use the daemon user.
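
Step 2 comes down to picking an account that actually exists on the instance. A minimal sketch of that check, meant to be run from an SSH session; `user_exists` and `first_existing_user` are hypothetical helper names, not part of the linked script, and the sketch only checks that an account exists, not what it is allowed to write to:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the named account exists on this machine.
user_exists() {
  id -u "$1" >/dev/null 2>&1
}

# Hypothetical helper: print the first existing account among the candidates.
first_existing_user() {
  local candidate
  for candidate in "$@"; do
    if user_exists "$candidate"; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# List account names only (first field of /etc/passwd).
cut -d: -f1 /etc/passwd

# Example: prefer 'celery', fall back to 'daemon'.
first_existing_user celery daemon || echo "neither account exists" >&2
```

Whichever account is chosen still needs write access to the pid and log paths given to celery beat, so it is worth checking the ownership of those directories as well.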

These steps resolved the celery beat errors. The updated config file with the script is below (celery.config):

```
files:
  "/opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/usr/bin/env bash

      # Create required directories
      sudo mkdir -p /var/log/celery/
      sudo mkdir -p /var/run/celery/

      # Create group called 'celery'
      sudo groupadd -f celery
      # add the user 'celery' if it doesn't exist and add it to the group with same name
      id -u celery &>/dev/null || sudo useradd -g celery celery
      # add permissions to the celery user for r+w to the folders just created
      sudo chown -R celery:celery /var/log/celery/
      sudo chown -R celery:celery /var/run/celery/

      # Get django environment variables
      celeryenv=`cat /opt/python/current/env | tr '\n' ',' | sed 's/%/%%/g' | sed 's/export //g' | sed 's/$PATH/%(ENV_PATH)s/g' | sed 's/$PYTHONPATH//g' | sed 's/$LD_LIBRARY_PATH//g'`
      celeryenv=${celeryenv%?}

      # Create CELERY configuration script
      celeryconf="[program:celeryd]
      directory=/opt/python/current/app
      ; Set full path to celery program if using virtualenv
      command=/opt/python/run/venv/bin/celery worker -A config.celery:app --loglevel=INFO --logfile=\"/var/log/celery/%%n%%I.log\" --pidfile=\"/var/run/celery/%%n.pid\"

      user=celery
      numprocs=1
      stdout_logfile=/var/log/celery-worker.log
      stderr_logfile=/var/log/celery-worker.log
      autostart=true
      autorestart=true
      startsecs=10

      ; Need to wait for currently executing tasks to finish at shutdown.
      ; Increase this if you have very long running tasks.
      stopwaitsecs = 60

      ; When resorting to send SIGKILL to the program to terminate it
      ; send SIGKILL to its whole process group instead,
      ; taking care of its children as well.
      killasgroup=true

      ; if rabbitmq is supervised, set its priority higher
      ; so it starts first
      priority=998

      environment=$celeryenv"


      # Create CELERY BEAT configuration script
      celerybeatconf="[program:celerybeat]
      ; Set full path to celery program if using virtualenv
      command=/opt/python/run/venv/bin/celery beat -A config.celery:app --loglevel=INFO --scheduler django_celery_beat.schedulers:DatabaseScheduler --logfile=\"/var/log/celery/celery-beat.log\" --pidfile=\"/var/run/celery/celery-beat.pid\"

      directory=/opt/python/current/app
      user=daemon
      numprocs=1
      stdout_logfile=/var/log/celerybeat.log
      stderr_logfile=/var/log/celerybeat.log
      autostart=true
      autorestart=true
      startsecs=10

      ; Need to wait for currently executing tasks to finish at shutdown.
      ; Increase this if you have very long running tasks.
      stopwaitsecs = 60

      ; When resorting to send SIGKILL to the program to terminate it
      ; send SIGKILL to its whole process group instead,
      ; taking care of its children as well.
      killasgroup=true

      ; if rabbitmq is supervised, set its priority higher
      ; so it starts first
      priority=999

      environment=$celeryenv"

      # Create the celery supervisord conf script
      echo "$celeryconf" | tee /opt/python/etc/celery.conf
      echo "$celerybeatconf" | tee /opt/python/etc/celerybeat.conf

      # Add configuration script to supervisord conf (if not there already).
      # Note: no -x here, because celery.conf is only part of the "files:" line,
      # so a whole-line match would never succeed.
      if ! grep -Fq "celery.conf" /opt/python/etc/supervisord.conf
        then
          echo "[include]" | tee -a /opt/python/etc/supervisord.conf
          echo "files: uwsgi.conf celery.conf celerybeat.conf" | tee -a /opt/python/etc/supervisord.conf
      fi

      # Enable supervisor to listen for HTTP/XML-RPC requests.
      # supervisorctl will use XML-RPC to communicate with supervisord over port 9001.
      # Source: https://askubuntu.com/questions/911994/supervisorctl-3-3-1-http-localhost9001-refused-connection
      if ! grep -Fxq "[inet_http_server]" /opt/python/etc/supervisord.conf
        then
          echo "[inet_http_server]" | tee -a /opt/python/etc/supervisord.conf
          echo "port = 127.0.0.1:9001" | tee -a /opt/python/etc/supervisord.conf
      fi

      # Reread the supervisord config
      supervisorctl -c /opt/python/etc/supervisord.conf reread

      # Update supervisord in cache without restarting all services
      supervisorctl -c /opt/python/etc/supervisord.conf update

      # Start/Restart celeryd through supervisord
      supervisorctl -c /opt/python/etc/supervisord.conf restart celeryd
      supervisorctl -c /opt/python/etc/supervisord.conf restart celerybeat

commands:
  01_killotherbeats:
    command: "ps auxww | grep 'celery beat' | awk '{print $2}' | sudo xargs kill -9 || true"
    ignoreErrors: true
  02_restartbeat:
    command: "supervisorctl -c /opt/python/etc/supervisord.conf restart celerybeat"
    leader_only: true
```

One thing to note: in my project the celery.py file is in the config directory, which is why I pass -A config.celery:app when starting the celery worker and celery beat.
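
The `celeryenv` line in the script is dense, so here is a minimal sketch of what that pipeline does, run against a made-up env file. The three variables are hypothetical, not taken from a real EB instance, and the sketch assumes the env file consists of `export KEY="value"` lines:

```shell
#!/usr/bin/env bash
# Hypothetical sample standing in for /opt/python/current/env.
env_file="$(mktemp)"
cat > "$env_file" <<'EOF'
export DJANGO_SETTINGS_MODULE="config.settings"
export PATH="/opt/python/run/venv/bin:$PATH"
export DISCOUNT="10%"
EOF

# Same pipeline as in the deploy script:
#   tr      : join the lines with commas (supervisord wants KEY="v",KEY2="v2")
#   sed 1   : escape % as %% (supervisord treats % as the start of an expansion)
#   sed 2   : drop the 'export ' prefix
#   sed 3   : replace the literal text $PATH with supervisord's %(ENV_PATH)s
#   sed 4-5 : drop literal $PYTHONPATH / $LD_LIBRARY_PATH references
celeryenv=$(cat "$env_file" | tr '\n' ',' | sed 's/%/%%/g' | sed 's/export //g' | sed 's/$PATH/%(ENV_PATH)s/g' | sed 's/$PYTHONPATH//g' | sed 's/$LD_LIBRARY_PATH//g')
# Strip the trailing comma left by tr.
celeryenv=${celeryenv%?}

rm -f "$env_file"
printf '%s\n' "$celeryenv"
```

For this sample the result is a single comma-separated line, `DJANGO_SETTINGS_MODULE="config.settings",PATH="/opt/python/run/venv/bin:%(ENV_PATH)s",DISCOUNT="10%%"`, which is the form the `environment=` setting in the supervisord program sections expects.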