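I. Startup begins with the scripts
- Starting the JobManager runs the script jobmanager.sh start
- jobmanager.sh shows that starting the JobManager ultimately calls flink-daemon.sh (or flink-console.sh when the mode is start-foreground)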
if [[ $STARTSTOP == "start-foreground" ]]; then
exec "${FLINK_BIN_DIR}"/flink-console.sh $JOBMANAGER_TYPE "${args[@]}"
else
"${FLINK_BIN_DIR}"/flink-daemon.sh $STARTSTOP $JOBMANAGER_TYPE "${args[@]}"
fi
- flink-daemon.sh in turn shows that the class launched for the jobmanager daemon is org.apache.flink.runtime.jobmanager.JobManager
case $DAEMON in
(jobmanager)
CLASS_TO_RUN=org.apache.flink.runtime.jobmanager.JobManager
;;
II. Exploring JobManager
- 1. Enter JobManager's main method; main calls runJobManager inside the installed security context
SecurityUtils.getInstalledContext.runSecured(new Callable[Unit] {
override def call(): Unit = {
runJobManager(
configuration,
executionMode,
externalHostName,
portRange)
}
})
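The wrapper pattern used by main (pass the startup logic as a Callable to whichever security context is installed) can be sketched in plain Java. This is a minimal illustration, not Flink's code: SecurityContextSketch, NoOpContextSketch and SecuredStartupSketch are hypothetical names, and the real context may log in with credentials before running the action.

```java
import java.util.concurrent.Callable;

// Hypothetical stand-in for a security context; not Flink's actual class.
interface SecurityContextSketch {
    <T> T runSecured(Callable<T> action) throws Exception;
}

// A context that adds no credentials and simply runs the action.
class NoOpContextSketch implements SecurityContextSketch {
    @Override
    public <T> T runSecured(Callable<T> action) throws Exception {
        return action.call();
    }
}

class SecuredStartupSketch {
    // Runs the startup logic through the given context and reports the result.
    static String start(SecurityContextSketch context, String host) {
        try {
            return context.runSecured(() -> "JobManager started on " + host);
        } catch (Exception e) {
            // The sketch rethrows unchecked so callers need no throws clause.
            throw new IllegalStateException("startup failed", e);
        }
    }
}
```

The caller never needs to know which context is installed; swapping in a context that authenticates first would not change the startup code.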
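- 2. runJobManager logic:
1. Open a ServerSocket on each port in the configured range until one binds, to find a port that can be assigned to the JobManager
2. Call the overloaded runJobManager method with that port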
def runJobManager(
configuration: Configuration,
executionMode: JobManagerMode,
listeningAddress: String,
listeningPortRange: java.util.Iterator[Integer])
: Unit = {
val result = AkkaUtils.retryOnBindException({
// Try all ports in the range until successful
val socket = NetUtils.createSocketFromPorts(
listeningPortRange,
new NetUtils.SocketFactory {
override def createSocket(port: Int): ServerSocket = new ServerSocket(
// Use the correct listening address, bound ports will only be
// detected later by Akka.
port, 0, InetAddress.getByName(NetUtils.getWildcardIPAddress))
})
val port =
if (socket == null) {
throw new BindException(s"Unable to allocate port for JobManager.")
} else {
try {
socket.getLocalPort()
} finally {
socket.close()
}
}
runJobManager(configuration, executionMode, listeningAddress, port)
}, { !listeningPortRange.hasNext }, 5000)
result match {
case scala.util.Failure(f) => throw f
case _ =>
}
}
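The port probing step above can be sketched without Flink's helpers. The following is a simplified illustration under stated assumptions: PortProbeSketch and findFreePort are hypothetical names, the sketch binds to the wildcard address with the default backlog, and it reports failure with an unchecked exception instead of throwing BindException directly.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.BindException;
import java.net.ServerSocket;
import java.util.Iterator;

class PortProbeSketch {
    // Returns the first candidate port that can be bound.
    // A candidate of 0 asks the operating system for any free port.
    static int findFreePort(Iterator<Integer> candidates) {
        while (candidates.hasNext()) {
            int port = candidates.next();
            // try-with-resources closes the probe socket before returning,
            // mirroring the try/finally in the excerpt above.
            try (ServerSocket socket = new ServerSocket(port)) {
                return socket.getLocalPort();
            } catch (IOException e) {
                // Port is in use or not permitted; move on to the next one.
            }
        }
        throw new UncheckedIOException(
            new BindException("Unable to allocate port for JobManager."));
    }
}
```

Because the probe socket is closed before the actor system binds the port, another process could take the port in between. That is presumably why the original wraps the whole block in AkkaUtils.retryOnBindException and keeps trying while the range still has ports left.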
- 3. Create the JobManager ActorSystem (the JobManager and TaskManager communicate through Akka actors)
val jobManagerSystem = startActorSystem(
configuration,
listeningAddress,
listeningPort)
- 4. Create and start the JobManager actor
val (jobManager, archive) = startJobManagerActors(
configuration,
jobManagerSystem,
futureExecutor,
ioExecutor,
highAvailabilityServices,
metricRegistry,
webMonitor.map(_.getRestAddress),
jobManagerClass,
archiveClass)
// Inside startJobManagerActors, this is the code that starts the actor
val jobManager: ActorRef = jobManagerActorName match {
case Some(actorName) => actorSystem.actorOf(jobManagerProps, actorName)
case None => actorSystem.actorOf(jobManagerProps)
}
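- 4.1 From the jobManagerClass parameter of startJobManagerActors (its value here is JobManager), locate the JobManager actor's lifecycle-related methods: preStart, receive (the message handler), and postStop, which is called when the actor stops
- 4.2 Enter the JobManager actor's lifecycle method preStart
- 4.2.1 Start the leader election service; this is standalone mode, so the leader role is granted directly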
leaderElectionService.start(this)
- 4.2.2 Enter StandaloneLeaderElectionService.start; in standalone mode it ends up calling the JobManager's grantLeadership method
contender.grantLeadership(HighAvailabilityServices.DEFAULT_LEADER_ID);
- 4.2.3 grantLeadership calls decorateMessage, which pattern-matches on the message type, and the resulting GrantLeadership(leaderSessionID) message is sent to the JobManager actor itself
override def grantLeadership(newLeaderSessionID: UUID): Unit = {
self ! decorateMessage(GrantLeadership(Option(newLeaderSessionID)))
}
override def decorateMessage(message: Any): Any = {
message match {
case msg: RequiresLeaderSessionID =>
LeaderSessionMessage(leaderSessionID.orNull, super.decorateMessage(msg))
case msg => super.decorateMessage(msg)
}
}
- 4.2.4 On receiving the GrantLeadership message, the JobManager actor assigns the JobManager the leader role
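
Steps 4.2.1 to 4.2.4 can be condensed into a small plain-Java sketch. It is an illustration only: all class names ending in Sketch are hypothetical, the mailbox is a simple queue standing in for the Akka actor mailbox, the decorateMessage wrapping is omitted, and the all-zero UUID is an assumed stand-in for HighAvailabilityServices.DEFAULT_LEADER_ID.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.UUID;

// Hypothetical contender interface: whoever can be granted leadership.
interface LeaderContenderSketch {
    void grantLeadership(UUID leaderSessionId);
}

// Standalone "election": the single contender becomes leader on start.
class StandaloneElectionSketch {
    // Assumed fixed ID standing in for DEFAULT_LEADER_ID.
    static final UUID DEFAULT_LEADER_ID = new UUID(0L, 0L);

    void start(LeaderContenderSketch contender) {
        contender.grantLeadership(DEFAULT_LEADER_ID);
    }
}

// Minimal actor stand-in: messages are queued, then handled one at a time.
class JobManagerSketch implements LeaderContenderSketch {
    // Message type mirroring GrantLeadership from the excerpt above.
    static final class GrantLeadership {
        final UUID sessionId;
        GrantLeadership(UUID sessionId) { this.sessionId = sessionId; }
    }

    private final Queue<Object> mailbox = new ArrayDeque<>();
    private UUID leaderSessionId; // null until leadership is granted

    @Override
    public void grantLeadership(UUID newLeaderSessionId) {
        // Counterpart of `self ! GrantLeadership(...)`: enqueue the message
        // instead of changing state inline.
        mailbox.add(new GrantLeadership(newLeaderSessionId));
    }

    // Drains the mailbox the way the actor's message handler would.
    void processMailbox() {
        Object message;
        while ((message = mailbox.poll()) != null) {
            if (message instanceof GrantLeadership) {
                leaderSessionId = ((GrantLeadership) message).sessionId;
            }
        }
    }

    boolean isLeader() { return leaderSessionId != null; }

    UUID leaderSessionId() { return leaderSessionId; }
}
```

The point of routing the grant through the mailbox is that the state change happens on the actor's own message-handling path, so the JobManager only counts as leader once the GrantLeadership message has been processed.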
