Airflow DAG dynamic structure

Posted 2019-08-05 01:59

Question:

I'm looking for a way to decide the DAG structure at the time the DAG is triggered, because I don't know in advance how many operators I'll need to run.

The execution sequence I'm planning to create is shown below.

           |-- Task B.1 --|                  |-- Task C.1 --|
           |-- Task B.2 --|                  |-- Task C.2 --|
  Task A --|-- Task B.3 --|---> Task  B ---> |-- Task C.3 --|
           |     ....     |                  |     ....     |
           |-- Task B.N --|                  |-- Task C.N --|

I'm not sure about the value of N.

Is this possible in Airflow? If so, how do I achieve it?

Thanks in advance.

Answer 1:

I had to do something similar in the past: I wrote a DAG that read a YAML file defining which tasks to create.

My situation was that the number of tables I was extracting data from could change every week. Instead of re-deploying the DAG to production every time I needed to add a new table, I pointed the DAG at a YAML file that described which tables to extract. Every time a new table came along, I would simply edit the YAML file with the new table's details.
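
A minimal sketch of that idea, not the original DAG. It assumes Airflow 2.4 or later (for the `schedule` argument and the `airflow.operators.python` import path), PyYAML, and a hypothetical YAML layout with a top-level `tables` list. The names `plan_tasks`, `load_tables`, `build_dag`, `run_step`, the `dynamic_tables` DAG id and the file path are all made up for illustration. It reproduces the fan-out / join / fan-out shape from the question, with one B.x and one C.x task per table.

```python
from datetime import datetime


def plan_tasks(tables):
    """Return (task_ids, edges) for: A -> B.1..B.N -> B -> C.1..C.N."""
    b_ids = [f"task_b_{name}" for name in tables]
    c_ids = [f"task_c_{name}" for name in tables]
    task_ids = ["task_a", *b_ids, "task_b", *c_ids]
    edges = (
        [("task_a", b) for b in b_ids]      # A fans out to every B.x
        + [(b, "task_b") for b in b_ids]    # every B.x joins into B
        + [("task_b", c) for c in c_ids]    # B fans out to every C.x
    )
    return task_ids, edges


def load_tables(path):
    """Read table names from a YAML file (assumed layout: `tables: [orders, users]`)."""
    import yaml  # PyYAML

    with open(path) as f:
        return yaml.safe_load(f)["tables"]


def run_step(task_id):
    """Placeholder for the real work done by each task."""
    print(f"running {task_id}")


def build_dag(tables):
    """Create one PythonOperator per planned task and wire up the edges."""
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    task_ids, edges = plan_tasks(tables)
    with DAG(
        dag_id="dynamic_tables",          # hypothetical DAG id
        start_date=datetime(2019, 8, 1),
        schedule=None,                    # run only when triggered
        catchup=False,
    ) as dag:
        tasks = {
            task_id: PythonOperator(
                task_id=task_id,
                python_callable=run_step,
                op_args=[task_id],
            )
            for task_id in task_ids
        }
        for upstream, downstream in edges:
            tasks[upstream] >> tasks[downstream]
    return dag


# In the DAG file itself, the DAG object has to end up at module level
# so that Airflow can discover it (the path is a placeholder):
# dag = build_dag(load_tables("/path/to/tables.yaml"))
```

The YAML file is read each time Airflow parses the DAG file, so N is fixed at parse time rather than at trigger time. Editing the file changes the structure the next time the file is parsed, without a re-deploy. The Airflow and PyYAML imports are kept inside the functions so that `plan_tasks` can be exercised on its own.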

I think it gets a bit trickier if an upstream task has to run first and then determines how many downstream tasks to run, as in the following related question:

Generating dynamic tasks in airflow based on output of an upstream task