Flink 1.10: Unified Job Submission Logic

Latest recommended article published on 2025-02-12 17:46:11

yuchuanchen



License: CC 4.0 BY-SA


Article link: https://blog.youkuaiyun.com/yuchuanchen/article/details/104502011

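Flink 1.10 uses FLIP-73, FLIP-81, and FLIP-74 to decouple job submission from the deployment environment: it unifies submission behind a single Executor interface and lets configuration be specified dynamically through -D parameters, which reduces the maintenance burden. The new JobClient API makes job management and downstream tooling easier to build.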


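Before Flink 1.10, job submission typically ran into the following problems:

Job submission was handled by the Execution Environments, which were tightly coupled to the deployment target (YARN, Kubernetes, Mesos). As a result, the number of Execution Environments kept growing, users had to maintain a lot of configuration for each target, and little code could be reused.
Users had to maintain a separate flink-conf.yaml for each job; unlike Spark, Flink offered no way to set configuration dynamically through -D parameters.
Job information could only be obtained through the REST API, which made it awkward for downstream tools to integrate.

Flink 1.10 addresses these problems with the following three FLIPs.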

1. FLIP-73: A generic Executor interface

FLIP-73 gives the following formula:

Final number of Execution Environments = number of APIs (batch, streaming) × number of deployment environments (local, remote, collection, cli/context) + ε (optimizedPlan, previewPlan)
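To make the formula concrete, here is a minimal sketch that simply enumerates the API × deployment-environment pairs named in it. The class name EnvironmentCount is made up for illustration, and the pairs are labels taken from the formula, not a list of Flink's actual environment classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Flink: counts the pairs in the FLIP-73 formula.
class EnvironmentCount {

    // Build one entry per (API, deployment environment) pair.
    static List<String> combinations() {
        String[] apis = {"batch", "streaming"};
        String[] targets = {"local", "remote", "collection", "cli/context"};
        List<String> result = new ArrayList<>();
        for (String api : apis) {
            for (String target : targets) {
                result.add(api + " x " + target);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> pairs = combinations();
        pairs.forEach(System.out::println);
        // The epsilon term (optimizedPlan, previewPlan) comes on top of this count.
        System.out.println("pairs before adding epsilon: " + pairs.size());
    }
}
```

Taken literally, two APIs and four deployment environments already give eight pairs, and every additional deployment environment adds one more per API.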

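Take StreamExecutionEnvironment, the environment behind the streaming API, as an example. It has the following subclasses:
[Figure: subclasses of StreamExecutionEnvironment]
The execute() method of StreamExecutionEnvironment is declared abstract, so each subclass has to provide its own implementation: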

public abstract JobExecutionResult execute(StreamGraph streamGraph) throws Exception;
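The consequence of an abstract execute() can be illustrated with a minimal sketch. All types below (SketchEnvironment, SketchLocalEnvironment, SketchRemoteEnvironment) are simplified stand-ins invented for this example, not Flink's real classes, and the returned strings only mark where environment-specific submission code would live.

```java
// Simplified stand-ins for illustration only; these are NOT Flink classes.
abstract class SketchEnvironment {
    // Like the abstract execute() above: the base class defines no submission logic.
    abstract String execute(String jobName) throws Exception;
}

class SketchLocalEnvironment extends SketchEnvironment {
    @Override
    String execute(String jobName) {
        // Local-only submission logic would go here.
        return "ran " + jobName + " in an embedded local cluster";
    }
}

class SketchRemoteEnvironment extends SketchEnvironment {
    private final String host;
    private final int port;

    SketchRemoteEnvironment(String host, int port) {
        this.host = host;
        this.port = port;
    }

    @Override
    String execute(String jobName) {
        // Remote-only submission logic would go here.
        return "submitted " + jobName + " to " + host + ":" + port;
    }
}

class PerEnvironmentSubmitDemo {
    public static void main(String[] args) throws Exception {
        // Hypothetical host and port, chosen only for the demo.
        SketchEnvironment env = new SketchRemoteEnvironment("jobmanager-host", 8081);
        System.out.println(env.execute("wordcount"));
    }
}
```

In this shape, supporting a new deployment target means adding another subclass for every API, which is the growth the formula describes.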

In Flink 1.10, the job submission logic is abstracted into a generic Executor interface, PipelineExecutor (FLIP-73).
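Before looking at the interface itself, the idea of delegating submission to an executor chosen from configuration can be sketched roughly as follows. Everything here is a simplified stand-in: SketchExecutor, SketchExecutorFactory, SketchNamedFactory, and SketchUnifiedEnvironment are hypothetical names, a plain Map replaces Flink's Configuration, and the factory methods isCompatibleWith and getExecutor are assumed to mirror Flink's own executor factory. The key execution.target is, to my understanding, the option Flink 1.10 uses to select the executor.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-ins for illustration only; these are NOT Flink's real types.
interface SketchExecutor {
    String execute(String pipeline, Map<String, String> configuration);
}

interface SketchExecutorFactory {
    boolean isCompatibleWith(Map<String, String> configuration);
    SketchExecutor getExecutor(Map<String, String> configuration);
}

// A factory that matches a single value of "execution.target".
class SketchNamedFactory implements SketchExecutorFactory {
    private final String target;

    SketchNamedFactory(String target) {
        this.target = target;
    }

    @Override
    public boolean isCompatibleWith(Map<String, String> configuration) {
        return target.equals(configuration.get("execution.target"));
    }

    @Override
    public SketchExecutor getExecutor(Map<String, String> configuration) {
        // Target-specific submission logic would live in the returned executor.
        return (pipeline, conf) -> "executed " + pipeline + " on " + target;
    }
}

// One environment for all targets: it only looks up an executor and delegates.
class SketchUnifiedEnvironment {
    private final List<SketchExecutorFactory> factories;
    private final Map<String, String> configuration;

    SketchUnifiedEnvironment(List<SketchExecutorFactory> factories, Map<String, String> configuration) {
        this.factories = factories;
        this.configuration = configuration;
    }

    String execute(String pipeline) {
        for (SketchExecutorFactory factory : factories) {
            if (factory.isCompatibleWith(configuration)) {
                return factory.getExecutor(configuration).execute(pipeline, configuration);
            }
        }
        throw new IllegalStateException(
                "No executor found for execution.target=" + configuration.get("execution.target"));
    }
}

class UnifiedSubmitDemo {
    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("execution.target", "yarn-per-job");
        SketchUnifiedEnvironment env = new SketchUnifiedEnvironment(
                Arrays.asList(new SketchNamedFactory("local"), new SketchNamedFactory("yarn-per-job")),
                conf);
        System.out.println(env.execute("wordcount"));
    }
}
```

With this structure, the environment no longer needs a subclass per deployment target: switching targets is a configuration change, and a new target only requires a new factory and executor.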

// PipelineExecutor.java
/**
 * The entity responsible for executing a {@link Pipeline}, i.e. a user job.
 */
@Internal
public interface PipelineExecutor {
	/**
	 * Executes a {@link Pipeline} based on the provided configuration and returns a {@link JobClient} which allows to
	 * interact with the job being executed, e.g. cancel it or take a savepoint.
	 *
	 * <p><b>ATTENTION:</b> The caller is responsible for managing the lifecycle of the returned {@link JobClient}. This
	 * means that e.g. {@code close()} should be called explicitly at the call-site.
	 *
	 * @param pipeline the {@link Pipeline} to execute
	 * @param configuration the {@link Configuration} with the required execution parameters
	 * @return a {@link CompletableFuture} with the {@link JobClient} corresponding to the pipeline.
	 */
	CompletableFuture<JobClient> execute(final Pipeline pipeline, final Configuration configuration)