How to install airflow?

Published 2020-06-07 07:26

Question:

I seem to be doing something wrong.

https://pythonhosted.org/airflow/start.html

$ export AIRFLOW_HOME=~/airflow
$ pip install apache-airflow
Requirement already satisfied
$ airflow initdb
airflow: Command not found 

$ python --version
Python 2.7.10

It's weird: the installation seemed to work fine (with some warnings, nothing serious) and reported that airflow, flask, etc. were successfully installed. But even after restarting the PC (Ubuntu 15.10), airflow is still not recognized as a command.

Answer 1:

  • You can create a virtual environment for Airflow to keep it as a separate entity: virtualenv airflow_virtualenv
  • Go to the bin folder of the virtual env: cd airflow_virtualenv/bin
  • Activate the virtual env: source activate
  • Set the airflow home path: export AIRFLOW_HOME=~/airflow [You can also put this statement in your ~/.profile or ~/.bashrc file so that you don't have to export every time]
  • Install Airflow: pip install apache-airflow [If it throws the "six" error while installing then run: pip install apache-airflow --ignore-installed six]
  • Initialize the database: airflow initdb
  • Start the webserver: airflow webserver -p 8080
  • View the Airflow UI: http://localhost:8080/


Answer 2:

Your steps look correct, assuming you haven't omitted anything. You could try virtualenv and virtualenvwrapper with the following steps to get an isolated Airflow environment.

pip install virtualenv
pip install virtualenvwrapper
# update and source your .profile
mkvirtualenv airflow
workon airflow
export AIRFLOW_VERSION=1.7.0
pip install airflow==${AIRFLOW_VERSION}
# optionally other modules
#pip install airflow[celery]==${AIRFLOW_VERSION}
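
Extras such as [celery] contain square brackets, which some shells (zsh, for example) treat as glob patterns unless the argument is quoted. A minimal sketch of building the requirement string in one place and passing it to pip quoted; requirement_spec is a hypothetical helper name, and the package, version, and extras in the usage lines are taken from examples in this thread:

```shell
# Build a pip requirement string from a package, a version, and optional extras.
# requirement_spec is a hypothetical helper, not a pip or Airflow command.
# It expects at least two arguments: package name and version.
requirement_spec() {
    package=$1
    version=$2
    shift 2
    # Join the remaining arguments (the extras) with commas.
    extras=$(IFS=,; printf '%s' "$*")
    if [ -n "$extras" ]; then
        printf '%s[%s]==%s\n' "$package" "$extras" "$version"
    else
        printf '%s==%s\n' "$package" "$version"
    fi
}

# Usage: print the install command with the requirement safely quoted.
spec=$(requirement_spec apache-airflow 1.10.6 celery postgres)
echo "pip install \"$spec\""
```

Quoting the whole requirement keeps the brackets away from the shell, so the same line behaves the same way in bash and zsh.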


Answer 3:

Using Python 3.6

export AIRFLOW_HOME="/Users/your_user_name/airflow"
export SLUGIFY_USES_TEXT_UNIDECODE=yes
brew install python python3
pip install -U apache-airflow[celery,s3,postgres,jdbc,mysql,gcp_api,crypto,kubernetes]

Using Python 3.7, there are some issues during installation related to this import:

from tenacity.async import AsyncRetrying

Airflow works with Python 3.7 once a dependency named tenacity is bumped to a newer version; there is a PR for this on the incubating side:

http://mail-archives.apache.org/mod_mbox/airflow-commits/201808.mbox/%3CJIRA.13177795.1533763090000.42816.1533763380326@Atlassian.JIRA%3E
https://issues.apache.org/jira/browse/AIRFLOW-2876

pip install tenacity==4.12.0
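
For context: async became a reserved keyword in Python 3.7, so the import above is a syntax error on that interpreter, and older tenacity releases that contain it fail to load. A rough sketch of checking the interpreter version before applying the pin; is_py37_or_newer is a hypothetical helper name, and having python3 on the PATH is an assumption:

```shell
# Succeed when a "major.minor[.patch]" version string is 3.7 or newer.
# is_py37_or_newer is a hypothetical helper name.
is_py37_or_newer() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # text before the next dot, if any
    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 7 ]; }
}

# Usage: ask the interpreter for its version; leave it empty if python3 is missing.
version=$(python3 -c 'import platform; print(platform.python_version())' 2>/dev/null) || version=""
if [ -n "$version" ] && is_py37_or_newer "$version"; then
    echo "Python $version: the tenacity pin from this answer may be needed"
fi
```

Whether the pin is actually required depends on the Airflow release being installed; the check only tells you that the interpreter is one where the old import cannot work.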

Now run Airflow:

airflow initdb
airflow webserver

Verify the app is running in the browser by visiting http://localhost:8080.

Then run:

airflow scheduler



Answer 4:

Here are the steps I followed to install Airflow:

Set the Airflow home in ~/.bashrc:

export AIRFLOW_HOME=~/airflow

Install from PyPI using pip:

pip install airflow

Initialise the metadata DB:

airflow initdb

Start the webserver:

airflow webserver -p 8080

Open a browser and go to localhost:8080 to view and use the UI.



Answer 5:

In addition to the above commands, you might have to start the scheduler so that jobs can run. The command is:

airflow scheduler



Answer 6:

I tried both pip install apache-airflow and pip3 install apache-airflow, and both had issues because everything was installed in ~/.local/bin/.

If you get an error saying that airflow cannot be run, you will find the executable at ~/.local/bin/airflow. You can then add an alias to your .bashrc: alias airflow='~/.local/bin/airflow'. Run bash again and you will be able to run airflow.

When you then try to run the webserver with either the Python 2 or Python 3 version, it will throw an error because it cannot find gunicorn. You can fix that by adding ~/.local/bin to the PATH:

export PATH=$PATH:~/.local/bin
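
If the export line goes into ~/.bashrc, every nested shell appends the directory again. A small sketch of an idempotent variant, assuming a POSIX-style shell; path_append is a hypothetical helper name, not something pip or Airflow provides:

```shell
# Append a directory to PATH only if it is not already listed.
# path_append is a hypothetical helper name.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present: leave PATH unchanged
        *) PATH="$PATH:$1" ;;     # otherwise append at the end
    esac
}

# Usage: make scripts from user-level pip installs (airflow, gunicorn) visible.
path_append "$HOME/.local/bin"
export PATH
```

Because this puts the whole directory on the PATH, the separate airflow alias is no longer needed, and gunicorn is found the same way.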



Answer 7:

It seems the directory containing airflow is not in your PATH. Does this happen with other Python packages?

Try:

export PATH=$PATH:/usr/local/bin/

This is the default path for airflow, and adding it should make the command work.
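
Before editing PATH, it can help to confirm where the script actually landed. A minimal sketch that checks a list of candidate directories for an executable; find_script_dir is a hypothetical helper name, and the two candidates in the usage lines are simply the locations mentioned in this thread:

```shell
# Print the first directory that contains an executable with the given name.
# find_script_dir is a hypothetical helper name.
# Arguments: the program name, followed by the directories to check in order.
find_script_dir() {
    name=$1
    shift
    for dir in "$@"; do
        if [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Usage: look for airflow in the locations mentioned in this thread.
if found=$(find_script_dir airflow "$HOME/.local/bin" /usr/local/bin); then
    echo "airflow found in $found"
else
    echo "airflow not found in the checked directories"
fi
```

Whichever directory is reported is the one to add to PATH; if none is reported, the install probably went somewhere else, such as a virtual environment.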



Answer 8:

The solution that worked for me was to create an environment and install Airflow inside it; after that I was able to run it.

-> Install virtualenv: $ pip install virtualenv

-> Create environment: $ python -m venv myenv

-> Activate environment: $ source myenv/bin/activate

-> Install airflow: (myenv)$ pip install airflow, or with postgres: (myenv)$ pip install airflow[postgres]

-> Start the server: (myenv)$ airflow webserver -p 8080
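
A common slip with these steps is running pip before the environment is active, which installs into the system or user site instead. A small sketch of a guard, relying on the VIRTUAL_ENV variable that the activate script exports; in_virtualenv is a hypothetical helper name:

```shell
# Succeed only when a virtual environment is active in this shell.
# The activate script exports VIRTUAL_ENV; in_virtualenv is a hypothetical helper name.
in_virtualenv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

# Usage: report the state before running pip install.
if in_virtualenv; then
    echo "Active environment: $VIRTUAL_ENV"
else
    echo "No virtual environment is active; run: source myenv/bin/activate"
fi
```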



Answer 9:

An important addition to all posts.

Apache Airflow changed its package name from airflow to apache-airflow. So the posts in this thread that run pip install airflow would install Apache Airflow 1.8, since the old package still exists.

To install a later version:

export AIRFLOW_HOME=~/airflow
pip install apache-airflow

Also consider which Python version to use. You can install Airflow with Python 2 or Python 3.



Answer 10:

There are a lot of answers, and no one has mentioned containers. From my perspective, running Airflow in Docker is much easier, especially for development. Here is probably the best project that supports Airflow Docker containers.

Here is a docker-compose file you can take as an example:

version: '2'
services:
  postgresql:
    image: bitnami/postgresql:latest
    volumes:
      - postgresql_data:/bitnami/postgresql
    env_file:
      - .env
    ports:
      - 5432:5432
  redis:
    image: bitnami/redis:latest
    env_file:
      - .env
    volumes:
      - redis_data:/bitnami
  airflow-worker:
    image: bitnami/airflow-worker:latest
    env_file:
      - .env
    volumes:
      - ./dags:/opt/bitnami/airflow/dags
      - ./plugins:/opt/bitnami/airflow/plugins
  airflow-scheduler:
    image: bitnami/airflow-scheduler:latest
    depends_on:
      - redis
    env_file:
      - .env
    volumes:
      - ./dags:/opt/bitnami/airflow/dags
      - ./plugins:/opt/bitnami/airflow/plugins
  airflow:
    image: bitnami/airflow:latest
    depends_on:
      - postgresql
    env_file:
      - .env
    ports:
      - 8080:8080
    volumes:
      - ./dags:/opt/bitnami/airflow/dags
      - ./plugins:/opt/bitnami/airflow/plugins
volumes:
  postgresql_data:
    driver: local
  redis_data:
    driver: local

And corresponding .env file:

AIRFLOW_EXECUTOR=CeleryExecutor
AIRFLOW_PASSWORD=admin
AIRFLOW_USERNAME=admin
LOAD_EXAMPLES=no

ALLOW_EMPTY_PASSWORD=yes
AIRFLOW__CORE__FERNET_KEY=46BKJoQYlPPOexq0OhDZnIlNepKFf87WFwLbfzqDDho=
AIRFLOW__CORE__DAG_DISCOVERY_SAFE_MODE=false
POSTGRESQL_DATABASE=bitnami_airflow
POSTGRESQL_USERNAME=bn_airflow
POSTGRESQL_PASSWORD=bitnami1
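
The Fernet key in the .env file above is published with this answer, so it is better treated as a placeholder. A minimal sketch of generating a fresh one, assuming head, base64, and tr are available; a Fernet key is 32 random bytes encoded as URL-safe base64, and generate_fernet_key is a hypothetical helper name:

```shell
# Generate a URL-safe base64 encoding of 32 random bytes (the Fernet key format).
# generate_fernet_key is a hypothetical helper name.
generate_fernet_key() {
    # base64 emits the standard alphabet; tr maps '+' and '/' to the URL-safe '-' and '_'.
    head -c 32 /dev/urandom | base64 | tr '+/' '-_'
}

# Usage: print a line that can replace the key in the .env file.
key=$(generate_fernet_key)
echo "AIRFLOW__CORE__FERNET_KEY=$key"
```

The same applies to the passwords in the file: the values shown are examples and are worth changing outside of local development.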

Regarding the plugins dir: it is a reserved directory where you can place your custom scripts, libraries, etc., and Airflow will pick them up automatically.



Answer 11:

I have macOS Mojave. Airflow comes with a default SQLite DB, which doesn't allow parallel processing. Installation is difficult.

brew install autoconf
brew install automake
brew install pkg-config
brew install libtool

Then create a virtual env for Python 3.7. In the venv, type:

pip install "apache-airflow[celery, crypto, postgres, rabbitmq, redis]==1.10.6"

Airflow will generally be installed at

/Library/Frameworks/Python.framework/Versions/3.7/bin/airflow

Add this path to your PATH variable.

Now type the following in the virtual env that you created above for Airflow:

airflow initdb

If this works you have installed airflow successfully.



Answer 12:

It worked for me:

$ SLUGIFY_USES_TEXT_UNIDECODE=yes pip3 install apache-airflow