Operators in Airflow

In Apache Airflow, the Operator is the core building block for defining tasks. Each Operator represents a specific kind of work, such as running a Python function, executing a Bash command, or calling an HTTP endpoint. The sections below introduce several commonly used Operators in detail:


1. DummyOperator

  • Purpose:

    • DummyOperator is a no-op task: it performs no actual work and is typically used as a placeholder or to shape the dependency structure of a workflow.

    • It is commonly used to mark the start or end of a DAG, or as a join point for conditional branches (see the branching sketch after the example below).

    • It also provides a convenient anchor for clearing or marking success on groups of upstream or downstream tasks.

    • Note that in Airflow 2.3+ it has been renamed to EmptyOperator; DummyOperator remains available as a deprecated alias.

  • Use cases:

    • Placeholder tasks that help structure complex workflows.

    • Controlling dependencies between tasks.

  • Example:

    from airflow import DAG
    from airflow.operators.dummy import DummyOperator
    from datetime import datetime
    
    dag = DAG(
        'dummy_example',
        start_date=datetime(2023, 1, 1),
        schedule_interval='@daily',
    )
    
    start = DummyOperator(task_id='start', dag=dag)
    end = DummyOperator(task_id='end', dag=dag)
    
    start >> end
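
To make the branching use case above concrete, here is a minimal sketch (assuming Airflow 2.x) of a DummyOperator acting as the join point behind a BranchPythonOperator. The DAG id, task ids, and the choose_branch function are made up for this illustration; the trigger_rule setting is what lets the join task run even though the unchosen branch is skipped.

    from airflow import DAG
    from airflow.operators.dummy import DummyOperator
    from airflow.operators.python import BranchPythonOperator
    from airflow.utils.trigger_rule import TriggerRule
    from datetime import datetime
    
    def choose_branch():
        # Hypothetical branching logic: return the task_id of the branch to follow
        return 'path_a'
    
    dag = DAG(
        'branch_join_example',  # example DAG id (assumed, not from the original text)
        start_date=datetime(2023, 1, 1),
        schedule_interval='@daily',
    )
    
    branch = BranchPythonOperator(
        task_id='branch',
        python_callable=choose_branch,
        dag=dag,
    )
    
    path_a = DummyOperator(task_id='path_a', dag=dag)
    path_b = DummyOperator(task_id='path_b', dag=dag)
    
    # Join point: NONE_FAILED lets this task run even though the
    # branch that was not chosen ends up in the "skipped" state.
    join = DummyOperator(
        task_id='join',
        trigger_rule=TriggerRule.NONE_FAILED,
        dag=dag,
    )
    
    branch >> [path_a, path_b] >> join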

2. PythonOperator

  • Purpose:

    • PythonOperator executes a Python function as a task within an Airflow DAG (Directed Acyclic Graph). It is one of the most commonly used operators in Airflow.

    • It takes a python_callable argument (the function you want to run) plus any other arguments that function needs. When the task executes, Airflow calls the specified function and runs the logic inside it.

    • Additional parameters can be supplied as needed, for example arguments for the python_callable or a pool for task execution, and the function's return value can be consumed by downstream tasks.

  • Use cases:

    • Running arbitrary Python logic (data transformations, API calls, custom business logic) as a task in a DAG.

  • Example:

    from airflow import DAG
    from airflow.operators.python import PythonOperator
    from datetime import datetime
    
    def my_python_function():
        # Your logic here
        print("Hello, I am running inside a PythonOperator")
    
    dag = DAG(
        'my_dag',
        start_date=datetime(2022, 1, 1),
        schedule_interval='@daily',
    )
    
    my_task = PythonOperator(
        task_id='my_task',
        python_callable=my_python_function,
        dag=dag,
    )

In this example, the DAG 'my_dag' runs on a daily schedule, and the task 'my_task' executes my_python_function every time the DAG is triggered.
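
Since the example above only prints a message, here is an additional minimal sketch (assuming Airflow 2.x) of the two extensions mentioned: passing arguments to the python_callable via op_kwargs, and letting a downstream task consume the function's return value through XCom. The DAG id, task ids, and the extract/load functions are hypothetical.

    from airflow import DAG
    from airflow.operators.python import PythonOperator
    from datetime import datetime
    
    def extract(limit):
        # Hypothetical upstream logic; the return value is pushed to XCom automatically
        return list(range(limit))
    
    def load(ti):
        # Pull the upstream task's return value from XCom via the task instance (ti)
        rows = ti.xcom_pull(task_ids='extract')
        print(f"Loaded {len(rows)} rows")
    
    dag = DAG(
        'python_xcom_example',  # example DAG id (assumed, not from the original text)
        start_date=datetime(2023, 1, 1),
        schedule_interval='@daily',
    )
    
    extract_task = PythonOperator(
        task_id='extract',
        python_callable=extract,
        op_kwargs={'limit': 10},  # keyword arguments forwarded to the python_callable
        dag=dag,
    )
    
    load_task = PythonOperator(
        task_id='load',
        python_callable=load,
        dag=dag,
    )
    
    extract_task >> load_task

op_kwargs supplies keyword arguments to the callable (op_args does the same positionally), and in Airflow 2.x the ti parameter is filled in from the task context automatically, so xcom_pull can retrieve whatever the upstream callable returned.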