Understanding the DummyOperator in Apache Airflow: A Simple Guide
Vidushraj Chandrasekaran
Data Engineer | GCP Certified Data Engineer | MS Certified Data Engineer | 6x Azure | Data Engineering | BSc (Hons) in EEE | AMIE(SL) | AEng(ECSL)
Apache Airflow is an open-source platform for developing, scheduling, and monitoring batch-oriented workflows. It offers different types of operators that enable the creation and automation of workflows. Among these, DummyOperator stands out as a simple operator in Airflow.
What is DummyOperator?
As the name suggests, it is an operator in Apache Airflow that does precisely nothing. That might sound confusing; however, its purpose lies in providing structure and control within DAGs.
Why do we use DummyOperator?
How to create one?
In Apache Airflow, you can import the DummyOperator from the airflow.operators.dummy_operator module.
from airflow.operators.dummy_operator import DummyOperator
The DummyOperator is instantiated using the code snippet below. This creates a task named 'dummy_operator_task' in the Airflow DAG. When you trigger the DAG, the task performs no action, but its status is shown as success.
from datetime import datetime

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator

# Default arguments applied to every task in the DAG
default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime(2023, 11, 16),
    'retries': 0,
}

# Run the DAG a single time and do not backfill past intervals
dag = DAG(
    dag_id='DAG-1',
    default_args=default_args,
    catchup=False,
    schedule_interval='@once',
)

# A task that performs no work and is marked as success when it runs
dummy_operator_task = DummyOperator(
    task_id='dummy_operator_task',
    dag=dag,
)
Example use case:
Consider a scenario in which you are designing a data pipeline: you might use a DummyOperator as the starting point of your DAG, followed by the actual tasks of the pipeline. The DummyOperator acts as an anchor, indicating the beginning of the workflow.
In conclusion, the DummyOperator does not perform any tangible work, but its presence contributes to the structure and clarity of your Airflow workflows.