Accessing configuration parameters passed to Airflow via CLI
I am trying to pass the following configuration parameters to the Airflow CLI when triggering a DAG run. Below is the trigger_dag command I am using.
airflow trigger_dag -c '{"account_list":"[1,2,3,4,5]", "start_date":"2016-04-25"}' insights_assembly_9900
My problem is: how can I access the conf parameters passed in this command from inside the DAG run?
This is probably a continuation of the answer provided by devj.
- In airflow.cfg, the following property must be set to true: dag_run_conf_overrides_params=True
- When defining the PythonOperator, pass the argument provide_context=True. For example:
  get_row_count_operator = PythonOperator(task_id='get_row_count', python_callable=do_work, dag=dag, provide_context=True)
- Define the python callable (note the use of **kwargs):
  def do_work(**kwargs):
      table_name = kwargs['dag_run'].conf.get('table_name')
      # Rest of the code
- Call the DAG from the command line:
airflow trigger_dag read_hive --conf '{"table_name": "my_table_name"}'
I found this discussion helpful.
There are two ways to access the parameters passed to the airflow trigger_dag command.

- In the callable method defined in the PythonOperator, you can access the parameters as kwargs['dag_run'].conf.get('account_list')
- If the field you are using it in is a templated field, you can use {{ dag_run.conf['account_list'] }}

The schedule_interval for the externally triggered DAG should be set to None for the above approaches to work.