
Clients can submit jobs in the following main ways:
- Command line
- REST API
- SQL
- Python
- Scala
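As a rough illustration of the REST route, the sketch below only builds the URLs a client would call. It assumes the cluster's REST endpoint is at `http://localhost:8081` (Flink's default REST port) and uses the `/jars/upload`, `/jars/:jarid/run` and `/jobs/overview` paths from the Flink REST API; the helper names are made up for this example, and the paths should be checked against the documentation of the Flink version in use.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: the JobManager's REST endpoint listens here (8081 is Flink's default REST port).
DEFAULT_BASE_URL = "http://localhost:8081"


def jar_upload_url(base_url=DEFAULT_BASE_URL):
    """URL a client POSTs the job JAR to (multipart/form-data upload)."""
    return base_url.rstrip("/") + "/jars/upload"


def jar_run_url(jar_id, base_url=DEFAULT_BASE_URL):
    """URL a client POSTs to in order to run a previously uploaded JAR."""
    return base_url.rstrip("/") + "/jars/" + quote(jar_id, safe="") + "/run"


def jobs_overview_url(base_url=DEFAULT_BASE_URL):
    """URL a client GETs to list the jobs known to the cluster."""
    return base_url.rstrip("/") + "/jobs/overview"


def fetch_jobs_overview(base_url=DEFAULT_BASE_URL):
    """Fetch and decode the job overview; requires a reachable cluster."""
    with urlopen(jobs_overview_url(base_url)) as response:
        return json.load(response)
```

Only `fetch_jobs_overview` touches the network; the other helpers can be used without a running cluster.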
There are three modes for submitting jobs to the JobManager:
- Application Mode: runs the cluster exclusively for one application. The job's main method (or client) gets executed on the JobManager. Calling execute/executeAsync multiple times in an application is supported. This mode exists only from version 1.11 onward and is not covered in this article.
- Per-Job Mode: runs the cluster exclusively for one job. The job's main method (or client) runs only prior to the cluster creation. Every job starts its own cluster.
- Session Mode: one JobManager instance manages multiple jobs sharing the same cluster of TaskManagers. The cluster has a single JobManager, and all jobs share the TaskManagers.
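The difference between the three modes can be made concrete by counting how many clusters each one starts. The following is a toy model rather than Flink code: `clusters_needed` and the `applications` mapping are hypothetical names, and the table of where `main()` runs simply restates the descriptions above, with the session-mode entry being an assumption beyond what the quoted text says.

```python
# Where the job's main() method runs in each mode (session mode is an assumption;
# the other two follow the mode descriptions quoted above).
MAIN_METHOD_RUNS_ON = {
    "application": "JobManager",
    "per-job": "client",
    "session": "client",
}


def clusters_needed(mode, applications):
    """Count the clusters started for a workload.

    applications maps an application name to the number of jobs it submits,
    i.e. how many times it calls execute/executeAsync.
    """
    if mode == "session":
        # One long-running cluster is shared by every job.
        return 1
    if mode == "per-job":
        # Each job gets a dedicated cluster.
        return sum(applications.values())
    if mode == "application":
        # Each application gets a dedicated cluster, however many jobs it submits.
        return len(applications)
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    workload = {"etl": 2, "report": 1}
    for mode in ("session", "per-job", "application"):
        print(mode, clusters_needed(mode, workload), MAIN_METHOD_RUNS_ON[mode])
```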

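Deployment
Whatever the deployment, there are roughly these four kinds of processes. Note that the master/yarn label at the bottom of the figure only indicates deployment on YARN's ApplicationMaster (AM) node.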

The JobManager (JM) has three components.
The JobManager has a number of responsibilities related to coordinating the distributed execution of Flink Applications: it decides when to schedule the next task (or set of tasks), reacts to finished tasks or execution failures, coordinates checkpoints, and coordinates recovery on failures, among others. This process consists of three different components:
ResourceManager (note that this ResourceManager and YARN's ResourceManager are two different things)
The ResourceManager is responsible for resource de-/allocation and provisioning in a Flink cluster — it manages task slots, which are the unit of resource scheduling in a Flink cluster (see TaskManagers). Flink implements multiple ResourceManagers for different environments and resource providers such as YARN, Mesos, Kubernetes and standalone deployments. In a standalone setup, the ResourceManager can only distribute the slots of available TaskManagers and cannot start new TaskManagers on its own.
Dispatcher
The Dispatcher provides a REST interface to submit Flink applications for execution and starts a new JobMaster for each submitted job. It also runs the Flink WebUI to provide information about job executions.
JobMaster
A JobMaster is responsible for managing the execution of a single JobGraph. Multiple jobs can run simultaneously in a Flink cluster, each having its own JobMaster.
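How the three components divide the work can be sketched with a few toy classes. This is a deliberately simplified model, not Flink's implementation: the class and method names are invented for illustration, slots are reduced to a counter, and having the JobMaster ask the ResourceManager for slots is an assumption of the sketch.

```python
class StandaloneResourceManager:
    """Hands out slots of already-running TaskManagers; cannot start new ones."""

    def __init__(self, total_slots):
        self.free_slots = total_slots

    def allocate(self, count):
        if count > self.free_slots:
            # In a standalone setup there is no way to provision more TaskManagers.
            raise RuntimeError("not enough free slots in the standalone cluster")
        self.free_slots -= count


class JobMaster:
    """Manages the execution of exactly one job."""

    def __init__(self, job_name, slots_needed):
        self.job_name = job_name
        self.slots_needed = slots_needed

    def request_slots(self, resource_manager):
        resource_manager.allocate(self.slots_needed)


class Dispatcher:
    """Accepts submissions and starts one JobMaster per submitted job."""

    def __init__(self, resource_manager):
        self.resource_manager = resource_manager
        self.job_masters = {}

    def submit(self, job_name, slots_needed):
        job_master = JobMaster(job_name, slots_needed)
        # Raises if the cluster is out of slots, so the job is never registered.
        job_master.request_slots(self.resource_manager)
        self.job_masters[job_name] = job_master
        return job_master
```

Submitting two jobs yields two separate `JobMaster` objects drawing on the same slot pool; a third submission fails once the slots are used up, mirroring the standalone limitation described above.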
Depending on the resource provider, there are four deployment options:
- Standalone: the most basic mode.
- Kubernetes
- YARN
- Mesos
Flink Standalone

Kubernetes
https://zhuanlan.zhihu.com/p/108302052?utm_source=wechat_timeline
YARN

Or see this truly impressive diagram:

Mesos
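External dependencies
High-availability service
Its main purpose is to keep a JobManager crash from taking the cluster down. Several standby JobManagers are kept, and when the active one crashes, a standby takes over quickly with the help of the high-availability service. The main providers are: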
- Zookeeper
- Kubernetes HA
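For reference, the two providers are selected through a handful of configuration entries. The sketch below renders them as `flink-conf.yaml` lines; the option names are the ones I recall from the Flink 1.12-era documentation (newer releases rename some of them), while the helper functions, quorum addresses, cluster id and storage path are placeholders invented for this example.

```python
def zookeeper_ha_config(quorum, storage_dir):
    """Entries that enable ZooKeeper-based high availability."""
    return {
        "high-availability": "zookeeper",
        "high-availability.zookeeper.quorum": quorum,
        # JobManager metadata needed for recovery is persisted here.
        "high-availability.storageDir": storage_dir,
    }


def kubernetes_ha_config(cluster_id, storage_dir):
    """Entries that enable Kubernetes-based high availability."""
    return {
        "kubernetes.cluster-id": cluster_id,
        "high-availability": (
            "org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory"
        ),
        "high-availability.storageDir": storage_dir,
    }


def render(config):
    """Render a config mapping as flink-conf.yaml style lines."""
    return "\n".join(f"{key}: {value}" for key, value in config.items())


if __name__ == "__main__":
    # Placeholder values; replace with the real quorum and a durable file system path.
    print(render(zookeeper_ha_config("zk1:2181,zk2:2181", "hdfs:///flink/ha")))
```

In both cases the storage directory should point at a durable file system, which ties in with the persistent storage dependency listed next.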
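Persistent storage service
Relies mainly on various local or remote file systems.
Resource provider service
Depends on the deployment option.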
Metrics Storage
Application-level data sources and sinks
External inputs and outputs, for example Kafka, Elasticsearch, and Cassandra.