Service Name:org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.ContainerAllocationExpirer
Service Name:AMLivelinessMonitor
Service Name:AMLivelinessMonitor
Service Name:org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer
Service Name:org.apache.hadoop.yarn.server.resourcemanager.NodesListManager
Service Name:org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher
Service Name:NMLivelinessMonitor
Service Name:org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService
Service Name:org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService
Service Name:org.apache.hadoop.yarn.server.resourcemanager.ClientRMService
Service Name:org.apache.hadoop.yarn.server.resourcemanager.AdminService
Service Name:org.apache.hadoop.yarn.server.resourcemanager.amlauncher.ApplicationMasterLauncher
1. When the ResourceManager starts, it launches 13 Services to do its work (listed above).
2. When an App is submitted, ClientRMService receives the App's information, hands the App to AppManager, and fires an APP_SUBMIT event.
3. After receiving the App, AppManager creates an RMAppImpl and sends it a START event. On that event, RMAppImpl creates an Attempt through a transition, app.createNewAttempt(), and sends a START event to the Attempt.
4. On the START event, RMAppAttemptImpl registers the App with ApplicationMasterService through a transition, appAttempt.masterService.registerAppAttempt(appAttempt.applicationAttemptId), and creates an AppAddedSchedulerEvent object. The transition method of ScheduleTransition is then called and the APP_ACCEPTED event is sent.
5. Once the APP_ACCEPTED event has been sent, a Container is requested for the AppMaster:
Allocation amContainerAllocation = appAttempt.scheduler.allocate(
    appAttempt.applicationAttemptId, Collections.singletonList(request), EMPTY_CONTAINER_RELEASE_LIST);
(By default, scheduler is an instance of the CapacityScheduler class.)
6. RMNode receives the NM heartbeat (STATUS_UPDATE), which triggers the CapacityScheduler's NODE_UPDATE event; a Container is then obtained from the scheduler and assigned to the MRAppMaster.
7. On receiving the CONTAINER_ALLOCATED event, RMAppAttempt sends a LAUNCH request to ApplicationMasterLauncher (AML), and AML tells the NM to start the Container.
8. Once started, MRAppMaster registers with the RM, the RMAppAttempt state is updated to RUNNING, and MRAppMaster begins its work.
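Steps 3 to 8 amount to an event-driven state machine plus a scheduler that only grants containers when a node heartbeat arrives. The following is a minimal sketch of that idea, not the Hadoop implementation: the class AttemptFlowSketch, the ToyScheduler helper, and the reduced State and Event enums are all hypothetical names invented for illustration, and the real RMAppAttemptImpl has more states and transitions than shown here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Toy model of steps 3 to 8; every name here is illustrative, not a Hadoop class. */
public class AttemptFlowSketch {

    /** Simplified attempt states (assumption: the real state set is larger). */
    enum State { NEW, SUBMITTED, SCHEDULED, ALLOCATED, LAUNCHED, RUNNING }

    /** Simplified events, named after the ones mentioned in the walkthrough. */
    enum Event { START, APP_ACCEPTED, CONTAINER_ALLOCATED, LAUNCHED, REGISTERED }

    private State state = State.NEW;
    private final Queue<Event> queue = new ArrayDeque<>();
    private final List<State> history = new ArrayList<>();

    /** Enqueue an event, the way an asynchronous dispatcher would. */
    void send(Event e) { queue.add(e); }

    /** Drain the queue, applying exactly one transition per event. */
    void drain() {
        Event e;
        while ((e = queue.poll()) != null) {
            state = transition(state, e);
            history.add(state);
        }
    }

    /** Transition table: each (state, event) pair maps to at most one next state. */
    static State transition(State s, Event e) {
        if (s == State.NEW && e == Event.START) return State.SUBMITTED;
        if (s == State.SUBMITTED && e == Event.APP_ACCEPTED) return State.SCHEDULED;
        if (s == State.SCHEDULED && e == Event.CONTAINER_ALLOCATED) return State.ALLOCATED;
        if (s == State.ALLOCATED && e == Event.LAUNCHED) return State.LAUNCHED;
        if (s == State.LAUNCHED && e == Event.REGISTERED) return State.RUNNING;
        throw new IllegalStateException("Invalid event " + e + " in state " + s);
    }

    State getState() { return state; }
    List<State> getHistory() { return history; }

    /** Toy scheduler: allocate() only records the request; a node heartbeat grants it. */
    static class ToyScheduler {
        private int pending = 0;

        void allocate(int containers) { pending += containers; }

        /** Returns true if this heartbeat resulted in a container being granted. */
        boolean nodeUpdate(int freeSlots) {
            if (pending > 0 && freeSlots > 0) {
                pending--;
                return true;
            }
            return false;
        }
    }

    public static void main(String[] args) {
        AttemptFlowSketch attempt = new AttemptFlowSketch();
        ToyScheduler scheduler = new ToyScheduler();

        attempt.send(Event.START);            // step 3: the App starts the Attempt
        attempt.send(Event.APP_ACCEPTED);     // step 4: the scheduler accepts the App
        attempt.drain();
        scheduler.allocate(1);                // step 5: request the AM container
        if (scheduler.nodeUpdate(4)) {        // step 6: an NM heartbeat triggers the grant
            attempt.send(Event.CONTAINER_ALLOCATED);
        }
        attempt.send(Event.LAUNCHED);         // step 7: the launcher starts the container
        attempt.send(Event.REGISTERED);       // step 8: the AM registers with the RM
        attempt.drain();
        System.out.println(attempt.getHistory());
    }
}
```

Running main prints [SUBMITTED, SCHEDULED, ALLOCATED, LAUNCHED, RUNNING]. The point of the split between allocate() and nodeUpdate() is that the request in step 5 does not by itself produce a container; in this sketch, as in the walkthrough, the grant happens only when a node reports in.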
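This article describes the 13 key services that the YARN ResourceManager initializes at startup and what they do, and walks through the full lifecycle of an application from submission to running, including how the ResourceManager interacts with the Application Master and the Node Manager.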
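The startup described in step 1 can be pictured as a parent service that registers child services and starts them one by one. Below is a minimal sketch of that composite pattern. It is self-contained and does not use Hadoop's classes: Service, NamedService, and Composite are hypothetical stand-ins, and the "Service Name:" log format simply mimics the listing at the top of this article rather than any confirmed ResourceManager output.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy composite-service pattern; these are illustrative classes, not Hadoop's own. */
public class CompositeStartSketch {

    /** Minimal service contract: a name plus a start/stop lifecycle. */
    interface Service {
        String getName();
        void start();
        void stop();
    }

    /** A leaf service that only records whether it is running. */
    static class NamedService implements Service {
        private final String name;
        private boolean running;

        NamedService(String name) { this.name = name; }

        public String getName() { return name; }
        public void start() { running = true; }
        public void stop() { running = false; }
        boolean isRunning() { return running; }
    }

    /** A parent that starts children in registration order and stops them in reverse. */
    static class Composite implements Service {
        private final List<Service> children = new ArrayList<>();
        private final List<String> startLog = new ArrayList<>();

        void addService(Service s) { children.add(s); }

        public String getName() { return "Composite"; }

        public void start() {
            for (Service s : children) {
                s.start();
                // Record one line per started child, in the style of the listing.
                startLog.add("Service Name:" + s.getName());
            }
        }

        public void stop() {
            // Reverse order, so later services shut down before the ones they may depend on.
            for (int i = children.size() - 1; i >= 0; i--) {
                children.get(i).stop();
            }
        }

        List<String> getStartLog() { return startLog; }
    }

    public static void main(String[] args) {
        Composite rm = new Composite();
        rm.addService(new NamedService("AMLivelinessMonitor"));
        rm.addService(new NamedService("NMLivelinessMonitor"));
        rm.start();
        rm.getStartLog().forEach(System.out::println);
    }
}
```

Registering children in a list keeps the start order deterministic, which is why a listing like the one at the top of the article comes out in the same order on every startup in this model.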