Is there a way in Airflow of using depends_on_past for an entire DagRun, not just applied to a Task?
I have a daily DAG, and the Friday DagRun errored on the 4th task; however, the Saturday and Sunday DagRuns still ran as scheduled. Using depends_on_past = True would have paused the DagRun on the same 4th task, but the first 3 tasks would still have run.
I can see in the DagRun DB table there is a state column that contains failed for the Friday DagRun. What I want is a way of configuring a DagRun to not start at all if the previous DagRun failed, rather than start and run until it reaches a Task that previously failed.
Does anyone know if this is possible?
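At your first task, set depends_on_past=True and wait_for_downstream=True. With this combination, the current DagRun runs only if the last run succeeded, because the first task of the current DagRun waits for its own previous instance (depends_on_past) and for that instance's downstream tasks (wait_for_downstream) to succeed. Note that, according to the Airflow documentation, wait_for_downstream only waits for tasks immediately downstream of the previous task instance, so this holds for the whole run only if every other task is directly downstream of the first task.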
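To make the gating rule concrete, here is a toy model in plain Python. It is not Airflow code: the function name first_task_can_run and the dict-based DAG layout are made up for illustration, and the rule it encodes is only my reading of the documented behaviour (the previous instance of the task and its immediately downstream tasks must have succeeded or been skipped).

```python
# Toy model of depends_on_past=True + wait_for_downstream=True on one task.
# Hypothetical helper, not part of Airflow.
OK_STATES = {"success", "skipped"}


def first_task_can_run(prev_states, first_task, downstream):
    """Decide whether `first_task` may start in the current DagRun.

    prev_states: task_id -> state of that task in the previous DagRun
    downstream:  task_id -> list of task_ids directly downstream of it
    """
    # The task's own previous instance plus its immediate downstream tasks.
    watched = [first_task] + list(downstream[first_task])
    return all(prev_states.get(task_id) in OK_STATES for task_id in watched)


# Linear DAG t1 >> t2 >> t3 >> t4, where t4 failed on Friday.
chain = {"t1": ["t2"], "t2": ["t3"], "t3": ["t4"], "t4": []}
friday = {"t1": "success", "t2": "success", "t3": "success", "t4": "failed"}
chain_allows_saturday = first_task_can_run(friday, "t1", chain)

# Fan-out DAG: a start task directly upstream of every other task.
fan_out = {"start": ["t1", "t2", "t3", "t4"]}
friday_fan = dict(friday, start="success")
fan_out_allows_saturday = first_task_can_run(friday_fan, "start", fan_out)
```

In this model the linear DAG still lets Saturday's first task start, because t4 is not directly downstream of t1, while the fan-out layout blocks it. That is why the placement of the first task matters for this approach.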
One possible solution would be to use xcom:
Add two tasks, start_task and end_task, to the DAG.
Make all other tasks depend on start_task.
Make end_task depend on all other tasks (set_upstream).
end_task will always push a variable last_success = context['execution_date'] to xcom (xcom_push). (Requires provide_context = True in the PythonOperators.)
start_task will always check xcom (xcom_pull) to see whether there exists a last_success variable with a value equal to the previous DagRun's execution_date or to the DAG's start_date (to let the process start).
Example use of xcom:
https://github.com/apache/incubator-airflow/blob/master/airflow/example_dags/example_xcom.py
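The two callables for the xcom approach could look roughly like the sketch below. This is untested against a live Airflow installation and rests on several assumptions:

- The function names (start_task_callable, end_task_callable, previous_run_succeeded) are made up for illustration.
- The context is assumed to contain ti, dag, execution_date and prev_execution_date, and the DAG is assumed to have its own start_date.
- The value is pushed as an ISO string rather than a datetime to sidestep serialization differences between Airflow versions.
- xcom_pull is called with include_prior_dates=True so that values from earlier DagRuns are visible.

```python
from datetime import datetime

XCOM_KEY = "last_success"  # key name taken from the answer above


def previous_run_succeeded(last_success, prev_execution_date,
                           execution_date, dag_start_date):
    """Return True if the current DagRun is allowed to start."""
    # First run of the DAG: nothing to wait for. This assumes the first
    # execution_date coincides with the DAG's start_date.
    if execution_date == dag_start_date:
        return True
    # Otherwise end_task of the previous run must have recorded its date.
    return (prev_execution_date is not None
            and last_success == prev_execution_date.isoformat())


def start_task_callable(**context):
    # Pull the marker written by end_task in an earlier DagRun.
    last_success = context["ti"].xcom_pull(
        task_ids="end_task", key=XCOM_KEY, include_prior_dates=True
    )
    allowed = previous_run_succeeded(
        last_success,
        context["prev_execution_date"],
        context["execution_date"],
        context["dag"].start_date,
    )
    if not allowed:
        # An uncaught exception fails start_task, so tasks downstream of it
        # are not run in this DagRun.
        raise RuntimeError("previous DagRun did not finish successfully")


def end_task_callable(**context):
    # Only reached when all upstream tasks succeeded (default trigger rule).
    context["ti"].xcom_push(
        key=XCOM_KEY, value=context["execution_date"].isoformat()
    )
```

Each function would be passed as python_callable to a PythonOperator with task_id start_task and end_task respectively (with provide_context = True where the Airflow version requires it), and the other tasks wired between them as described above.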