Hadoop Source Code Analysis: The ApplicationMaster Startup Flow

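This article traces, at the level of source-code calls, the whole path from application submission to the launch of the ApplicationMaster. Along the way it touches several major components: ClientRMService, RMAppManager, RMAppImpl, RMAppAttemptImpl, RMNode, and ResourceScheduler.

After the client invokes the RPC function ApplicationClientProtocol#submitApplication, processing on the ResourceManager side proceeds as follows: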


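Step 1:

ClientRMService in the ResourceManager implements the ApplicationClientProtocol protocol. It handles requests from clients and calls RMAppManager#submitApplication to notify the other relevant services for further processing.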

//ClientRMService.java
public SubmitApplicationResponse submitApplication(
      SubmitApplicationRequest request) throws YarnException {
    ...

    try {
      // call RMAppManager to submit application directly
      // start submitting the job
      rmAppManager.submitApplication(submissionContext,
          System.currentTimeMillis(), user);

      LOG.info("Application with id " + applicationId.getId() + 
          " submitted by user " + user);
      RMAuditLogger.logSuccess(user, AuditConstants.SUBMIT_APP_REQUEST,
          "ClientRMService", applicationId);
    } catch (YarnException e) {
      ...
    }
    return response;
  }

Step 2:

RMAppManager creates an RMAppImpl object for the application to track its running state, and sends an RMAppEventType.START event.

//RMAppManager.java
protected void submitApplication(
      ApplicationSubmissionContext submissionContext, long submitTime,
      String user) throws YarnException {

    LOG.info("begin to submitApplication");
    // get the application ID
    ApplicationId applicationId = submissionContext.getApplicationId();

    // build an app and put it into applicationACLs
    RMAppImpl application =
        createAndPopulateNewRMApp(submissionContext, submitTime, user);
    ApplicationId appId = submissionContext.getApplicationId();

    if (UserGroupInformation.isSecurityEnabled()) {
      ...
    } else {
      // Dispatcher is not yet started at this time, so these START events
      // enqueued should be guaranteed to be first processed when dispatcher
      // gets started.
      // trigger the app start event
      LOG.info("send  event RMAppEventType.START");
      LOG.info("this.rmContext="+this.rmContext.toString());
      // this.rmContext=RMContextImpl
      this.rmContext.getDispatcher().getEventHandler()
        .handle(new RMAppEvent(applicationId, RMAppEventType.START));
    }
  }

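Step 3:

After RMAppImpl receives the RMAppEventType.START event, it calls RMStateStore#storeApplication to persist RMAppImpl's current information to a log.

At this point the running state of RMAppImpl moves from NEW to NEW_SAVING. This step is fairly involved, so it is described in detail below.

RMAppEventType is registered with the central asynchronous dispatcher in ResourceManager.java: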

//ResourceManager.java
protected void serviceInit(Configuration configuration) throws Exception {
      conf.setBoolean(Dispatcher.DISPATCHER_EXIT_ON_ERROR_KEY, true);
      ...
      rmDispatcher.register(SchedulerEventType.class, schedulerDispatcher);

      // Register event handler for RmAppEvents
      rmDispatcher.register(RMAppEventType.class,
          new ApplicationEventDispatcher(rmContext));

      // Register event handler for RmAppAttemptEvents
      rmDispatcher.register(RMAppAttemptEventType.class,
          new ApplicationAttemptEventDispatcher(rmContext));

      // Register event handler for RmNodes
      rmDispatcher.register(
          RMNodeEventType.class, new NodeEventDispatcher(rmContext));
       ...
      }
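
The registration calls above attach one handler to each event-type enum class. The following is a minimal sketch of that idea, not the YARN implementation: MiniRegistry, TypeHandler, AppEventType, and NodeEventType are hypothetical names, and routing by the enum's declaring class is an assumption made for the illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RegistryDemo {
  // Hypothetical event types standing in for RMAppEventType, RMNodeEventType, etc.
  enum AppEventType { START, KILL }
  enum NodeEventType { STARTED }

  // Hypothetical stand-in for a per-type event handler.
  interface TypeHandler {
    void handle(Enum<?> eventType);
  }

  static class MiniRegistry {
    // One handler per event-type enum class.
    private final Map<Class<?>, TypeHandler> handlers = new HashMap<>();

    void register(Class<? extends Enum<?>> eventTypeClass, TypeHandler handler) {
      handlers.put(eventTypeClass, handler);
    }

    // Route an event to the handler registered for the enum class that declares it.
    void dispatch(Enum<?> eventType) {
      TypeHandler handler = handlers.get(eventType.getDeclaringClass());
      if (handler == null) {
        throw new IllegalStateException("No handler registered for " + eventType);
      }
      handler.handle(eventType);
    }
  }

  // Registers two handlers and records which one receives each event.
  static List<String> run() {
    List<String> calls = new ArrayList<>();
    MiniRegistry registry = new MiniRegistry();
    registry.register(AppEventType.class, t -> calls.add("app:" + t));
    registry.register(NodeEventType.class, t -> calls.add("node:" + t));
    registry.dispatch(AppEventType.START);
    registry.dispatch(NodeEventType.STARTED);
    return calls;
  }

  public static void main(String[] args) {
    System.out.println(run()); // prints [app:START, node:STARTED]
  }
}
```

Keying the map by enum class means a new family of events only needs one extra register call; the code that enqueues events does not need to know which component consumes them.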

In the RMAppManager code above, this.rmContext is an RMContextImpl,

this.rmContext.getDispatcher() is an AsyncDispatcher,

and this.rmContext.getDispatcher().getEventHandler() is an AsyncDispatcher$GenericEventHandler.

So execution enters the handle function of GenericEventHandler, an inner class of AsyncDispatcher:

//AsyncDispatcher.java
class GenericEventHandler implements EventHandler<Event> {
    public void handle(Event event) {
    	LOG.info("begin to call GenericEventHandler::handle, event= "+event.toString());
      if (blockNewEvents) {
        return;
      }
      drained = false;

      /* all this method does is enqueue all the events onto the queue */
      int qSize = eventQueue.size();
      if (qSize !=0 && qSize %1000 == 0) {
        LOG.info("Size of event-queue is " + qSize);
      }
      int remCapacity = eventQueue.remainingCapacity();
      if (remCapacity < 1000) {
        LOG.warn("Very low remaining capacity in the event-queue: "
            + remCapacity);
      }
      try {
    	  LOG.info("begin to put event in queue.");
        eventQueue.put(event);
      } catch (InterruptedException e) {
        if (!stopped) {
          LOG.warn("AsyncDispatcher thread interrupted", e);
        }
        throw new YarnRuntimeException(e);
      }
    };
  }

In handle, the event is ultimately placed on the eventQueue queue: eventQueue.put(event);

Note that this asynchronous dispatcher class, AsyncDispatcher, is shared by all event types.
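
To make the enqueue-then-dispatch pattern concrete, here is a minimal sketch of a queue-backed dispatcher. It is an illustration under stated assumptions, not the AsyncDispatcher code: MiniAsyncDispatcher is a hypothetical class, events are plain strings, and the "__STOP__" sentinel used for shutdown is invented for this example. It also shows why events enqueued before the dispatcher starts are processed first once it does start.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

class MiniAsyncDispatcher {
  // Hypothetical sentinel that tells the worker thread to exit.
  private static final String STOP = "__STOP__";

  private final BlockingQueue<String> eventQueue = new LinkedBlockingQueue<>();
  private final List<String> handled = new CopyOnWriteArrayList<>();
  private final Thread worker = new Thread(this::drainLoop, "mini-dispatcher");

  // Producer side: all this method does is enqueue the event.
  void handle(String event) {
    try {
      eventQueue.put(event);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }

  void start() {
    worker.start();
  }

  // Enqueue the sentinel and wait until the worker has drained everything before it.
  void stop() {
    handle(STOP);
    try {
      worker.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }

  // Consumer side: take events in FIFO order and "dispatch" them.
  private void drainLoop() {
    try {
      while (true) {
        String event = eventQueue.take();
        if (STOP.equals(event)) {
          return;
        }
        handled.add(event);
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  static List<String> run() {
    MiniAsyncDispatcher dispatcher = new MiniAsyncDispatcher();
    dispatcher.handle("START");       // enqueued before the worker is started
    dispatcher.start();
    dispatcher.handle("NODE_UPDATE"); // enqueued while the worker is running
    dispatcher.stop();
    return dispatcher.handled;
  }

  public static void main(String[] args) {
    System.out.println(run()); // prints [START, NODE_UPDATE]
  }
}
```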

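Once the RMAppEventType.START event is in eventQueue, the dispatcher thread takes it off the queue and routes it, through the ApplicationEventDispatcher registered for RMAppEventType, to RMAppImpl, entering its handle function: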

//RMAppImpl.java
public void handle(RMAppEvent event) {

    this.writeLock.lock();

    try {
      ApplicationId appID = event.getApplicationId();
      LOG.info("Processing event for " + appID + " of type "
          + event.getType());
      final RMAppState oldState = getState();
      try {
        /* keep the master in sync with the state machine */
        this.stateMachine.doTransition(event.getType(), event);
      } catch (InvalidStateTransitonException e) {
        LOG.error("Can't handle this event at current state", e);
        /*  fail the application on the failed transition */
      }

      if (oldState != getState()) {
        LOG.info(appID + " State change from " + oldState + " to "
            + getState());
      }
    } finally {
      this.writeLock.unlock();
    }
  }

The key statement here is

this.stateMachine.doTransition(event.getType(), event);
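
The shape of this handle function (take the write lock, remember the old state, attempt the transition, log a change) can be sketched in isolation. This is a simplified illustration, not RMAppImpl: MiniApp, its two states, and the hard-coded next method are hypothetical, and an IllegalStateException stands in for the invalid-transition exception.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class MiniApp {
  enum State { NEW, NEW_SAVING }

  private final ReentrantReadWriteLock.WriteLock writeLock =
      new ReentrantReadWriteLock().writeLock();
  private final List<String> log = new ArrayList<>();
  private State state = State.NEW;

  void handle(String eventType) {
    writeLock.lock();
    try {
      State oldState = state;
      try {
        // Stand-in for stateMachine.doTransition(...)
        state = next(state, eventType);
      } catch (IllegalStateException e) {
        log.add("Can't handle " + eventType + " at " + oldState);
      }
      if (oldState != state) {
        log.add(oldState + " -> " + state);
      }
    } finally {
      writeLock.unlock();
    }
  }

  // Hypothetical hard-coded transition rule: only NEW + START is valid.
  private static State next(State current, String eventType) {
    if (current == State.NEW && eventType.equals("START")) {
      return State.NEW_SAVING;
    }
    throw new IllegalStateException(eventType + " is invalid at " + current);
  }

  static List<String> run() {
    MiniApp app = new MiniApp();
    app.handle("START"); // valid: NEW -> NEW_SAVING
    app.handle("START"); // invalid at NEW_SAVING, state stays unchanged
    return app.log;
  }

  public static void main(String[] args) {
    System.out.println(run());
    // prints [NEW -> NEW_SAVING, Can't handle START at NEW_SAVING]
  }
}
```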

This stateMachine is built by a state machine factory, in which many event transitions are bound:

//RMAppImpl.java
private static final StateMachineFactory<RMAppImpl,
                                           RMAppState,
                                           RMAppEventType,
                                           RMAppEvent> stateMachineFactory
                               = new StateMachineFactory<RMAppImpl,
                                           RMAppState,
                                           RMAppEventType,
                                           RMAppEvent>(RMAppState.NEW)


     // Transitions from NEW state
    .addTransition(RMAppState.NEW, RMAppState.NEW,
        RMAppEventType.NODE_UPDATE, new RMAppNodeUpdateTransition())
    .addTransition(RMAppState.NEW, RMAppState.NEW_SAVING,
        RMAppEventType.START, new RMAppNewlySavingTransition())
    .addTransition(...)
    ...
    .addTransition(...)

The second of these is

addTransition(RMAppState.NEW, RMAppState.NEW_SAVING, RMAppEventType.START, new RMAppNewlySavingTransition())

It means: on an event of type RMAppEventType.START, move the state from RMAppState.NEW to RMAppState.NEW_SAVING, and invoke the callback class RMAppNewlySavingTransition.
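
A transition table of this kind can be sketched as a lookup from (current state, event type) to a post-state plus a callback. The code below is a minimal illustration under stated assumptions: MiniStateMachine, its enums, and the Runnable hooks are hypothetical, and unlike the real factory this sketch mutates a single object instead of building a separate state machine from a factory.

```java
import java.util.EnumMap;
import java.util.Map;

class MiniStateMachine {
  enum State { NEW, NEW_SAVING }
  enum EventType { START, NODE_UPDATE }

  // One table entry: the state to move to and the callback to run.
  private static final class Arc {
    final State postState;
    final Runnable hook;

    Arc(State postState, Runnable hook) {
      this.postState = postState;
      this.hook = hook;
    }
  }

  private final Map<State, Map<EventType, Arc>> table = new EnumMap<>(State.class);
  private State current;

  MiniStateMachine(State initial) {
    this.current = initial;
  }

  MiniStateMachine addTransition(State pre, State post, EventType type, Runnable hook) {
    table.computeIfAbsent(pre, s -> new EnumMap<EventType, Arc>(EventType.class))
        .put(type, new Arc(post, hook));
    return this;
  }

  // Look up the arc for (current state, event type), run its hook, then move state.
  State doTransition(EventType type) {
    Map<EventType, Arc> arcs = table.get(current);
    Arc arc = (arcs == null) ? null : arcs.get(type);
    if (arc == null) {
      throw new IllegalStateException("Invalid event " + type + " at " + current);
    }
    arc.hook.run();
    current = arc.postState;
    return current;
  }

  static String run() {
    StringBuilder calls = new StringBuilder();
    MiniStateMachine machine = new MiniStateMachine(State.NEW)
        .addTransition(State.NEW, State.NEW, EventType.NODE_UPDATE,
            () -> calls.append("nodeUpdate;"))
        .addTransition(State.NEW, State.NEW_SAVING, EventType.START,
            () -> calls.append("newlySaving;"));
    machine.doTransition(EventType.NODE_UPDATE);       // stays in NEW
    State end = machine.doTransition(EventType.START); // moves to NEW_SAVING
    return calls + end.name();
  }

  public static void main(String[] args) {
    System.out.println(run()); // prints nodeUpdate;newlySaving;NEW_SAVING
  }
}
```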

Inside addTransition, the second parameter postState is passed to a newly constructed instance of the inner class SingleInternalArc:

//StateMachineFactory.java
public StateMachineFactory
             <OPERAND, STATE, EVENTTYPE, EVENT>
          addTransition(STATE preState, STATE postState,
                        EVENTTYPE eventType,
                        SingleArcTransition<OPERAND, EVENT> hook){
    return new StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT>
        (this, new ApplicableSingleOrMultipleTransition<OPERAND, STATE, EVENTTYPE, EVENT>
           (preState, eventType, new SingleInternalArc(postState, hook)));
  }

The constructor of the inner class SingleInternalArc stores the post-transition state postState, which here is RMAppState.NEW_SAVING.

It also stores the callback hook, which here is an RMAppNewlySavingTransition.

//StateMachineFactory.java
SingleInternalArc(STATE postState,
        SingleArcTransition<OPERAND, EVENT> hook) {
      this.postState = postState;
      this.hook = hook;
    }
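
One detail visible in the addTransition code above is that it returns a new StateMachineFactory built from the current one rather than modifying it in place. A minimal sketch of that chaining style follows; MiniFactory and its string-based transitions are hypothetical and only illustrate the pattern, not the real data structures.

```java
import java.util.ArrayList;
import java.util.List;

class MiniFactory {
  private final MiniFactory previous; // factory this one was derived from, or null
  private final String transition;    // the transition added at this step, or null

  MiniFactory() {
    this(null, null);
  }

  private MiniFactory(MiniFactory previous, String transition) {
    this.previous = previous;
    this.transition = transition;
  }

  // Returns a new factory that links back to this one; this object is unchanged.
  MiniFactory addTransition(String preState, String postState, String eventType) {
    return new MiniFactory(this, preState + " --" + eventType + "--> " + postState);
  }

  // Walk the chain back to the root and list transitions oldest first.
  List<String> transitions() {
    List<String> out = new ArrayList<>();
    for (MiniFactory f = this; f != null && f.transition != null; f = f.previous) {
      out.add(0, f.transition);
    }
    return out;
  }

  static List<String> run() {
    MiniFactory empty = new MiniFactory();
    MiniFactory withStart = empty.addTransition("NEW", "NEW_SAVING", "START");
    List<String> out = new ArrayList<>();
    out.add("empty=" + empty.transitions().size()); // the original factory is untouched
    out.addAll(withStart.transitions());
    return out;
  }

  public static void main(String[] args) {
    System.out.println(run()); // prints [empty=0, NEW --START--> NEW_SAVING]
  }
}
```

Because each call returns a fresh object, a long chain of addTransition calls can be assigned to a static final field without any intermediate factory being observable in a half-built state.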

Returning to the handle function of the RMAppImpl class, it calls this.stateMachine.doTransition(e
