Submission process for Spark on YARN (cluster mode)
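- When an application is submitted, a SparkSubmit process starts. It creates an argument-parsing object (SparkSubmitArguments) and the objects that connect to YARN (YarnClusterApplication, YarnClient). These connect to the ResourceManager and ask it to launch the ApplicationMaster.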
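- The ResourceManager launches the ApplicationMaster in a container on a NodeManager.
- The ApplicationMaster starts the Driver (as a thread inside its own process), and the Driver initializes the SparkContext.
- The ApplicationMaster registers itself with the ResourceManager and requests resources.
- The ResourceManager returns the list of available resources (containers).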
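- The ApplicationMaster launches the executor backend process, YarnCoarseGrainedExecutorBackend, in the allocated containers. This process hosts the Executor and handles its communication with the Driver.
- YarnCoarseGrainedExecutorBackend registers itself with the Driver.
- The Driver replies that registration succeeded.
- YarnCoarseGrainedExecutorBackend starts the Executor.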
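
The ordering of the steps above can be sketched as a toy simulation. This is not Spark or YARN code: every class, method name, message label, and container id below (`Trace`, `simulate_submission`, `"c1"`, and so on) is hypothetical and exists only to show who talks to whom, and in what order. It also assumes the ResourceManager simply hands out the first free containers, which glosses over real scheduling.

```python
class Trace:
    """Ordered log of (sender, receiver, message) tuples."""

    def __init__(self):
        self.events = []

    def log(self, sender, receiver, message):
        self.events.append((sender, receiver, message))


class Driver:
    """Toy stand-in for the Driver; it only tracks registered backends."""

    def __init__(self, trace):
        self.trace = trace
        self.registered = []  # ids of backends that have registered

    def register(self, backend_id):
        self.trace.log(backend_id, "Driver", "register")
        self.registered.append(backend_id)
        self.trace.log("Driver", backend_id, "registration succeeded")
        return True


class ExecutorBackend:
    """Toy stand-in for the executor backend process."""

    def __init__(self, backend_id, trace):
        self.backend_id = backend_id
        self.trace = trace
        self.executor_started = False

    def start(self, driver):
        # The Executor is started only after the Driver acknowledges registration.
        if driver.register(self.backend_id):
            self.executor_started = True
            self.trace.log(self.backend_id, self.backend_id, "start Executor")


class ResourceManager:
    """Toy stand-in that hands out containers from a fixed list (an assumption)."""

    def __init__(self, trace, free_containers):
        self.trace = trace
        self.free_containers = list(free_containers)

    def allocate(self, count):
        self.trace.log("ApplicationMaster", "ResourceManager",
                       "register and request containers")
        granted = self.free_containers[:count]
        self.free_containers = self.free_containers[count:]
        self.trace.log("ResourceManager", "ApplicationMaster",
                       "return allocated containers")
        return granted


def simulate_submission(num_executors, free_containers):
    """Replay the submission steps in order and return the recorded trace."""
    trace = Trace()
    rm = ResourceManager(trace, free_containers)

    # The submitting process asks the ResourceManager for an ApplicationMaster.
    trace.log("SparkSubmit", "ResourceManager", "submit application")
    trace.log("ResourceManager", "ApplicationMaster", "launch")

    # The ApplicationMaster starts the Driver before asking for executors.
    trace.log("ApplicationMaster", "Driver", "start and initialize SparkContext")
    driver = Driver(trace)

    containers = rm.allocate(num_executors)

    # One backend per granted container; each registers, then starts its Executor.
    backends = []
    for container in containers:
        backend = ExecutorBackend(f"backend-{container}", trace)
        trace.log("ApplicationMaster", backend.backend_id, "launch")
        backend.start(driver)
        backends.append(backend)
    return trace, driver, backends


if __name__ == "__main__":
    trace, _, _ = simulate_submission(2, ["c1", "c2", "c3"])
    for sender, receiver, message in trace.events:
        print(f"{sender} -> {receiver}: {message}")
```

Running the script prints one line per message, so the output can be read side by side with the list. The point to notice is that an Executor is started only after its backend has received the Driver's acknowledgement.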
As shown in the figure below: