1. Straight to the point: the flink-daemon.sh script shows that the TaskManager's entry class is org.apache.flink.runtime.taskmanager.TaskManager.
2. In the main method, the key step is starting the TaskManager:
try {
  SecurityUtils.getInstalledContext.runSecured(new Callable[Unit] {
    override def call(): Unit = {
      // Run the TaskManager. Note classOf[TaskManager]: this is the class the TaskManager actor
      // is started from, and the actor's lifecycle methods live in this class.
      selectNetworkInterfaceAndRunTaskManager(configuration, resourceId, classOf[TaskManager])
    }
  })
}
3. selectNetworkInterfaceAndRunTaskManager does three main things:
a. Creates the high-availability services
b. Selects the hostname and port range for the TaskManager
c. Starts the TaskManager
def selectNetworkInterfaceAndRunTaskManager(
    configuration: Configuration,
    resourceID: ResourceID,
    taskManagerClass: Class[_ <: TaskManager])
  : Unit = {
  val highAvailabilityServices = HighAvailabilityServicesUtils.createHighAvailabilityServices(
    configuration,
    Executors.directExecutor(),
    AddressResolution.TRY_ADDRESS_RESOLUTION)
  // Select the network interface (hostname) and the port range
  val (taskManagerHostname, actorSystemPortRange) = selectNetworkInterfaceAndPortRange(
    configuration,
    highAvailabilityServices)
  try {
    // Start the TaskManager
    runTaskManager(
      taskManagerHostname,
      resourceID,
      actorSystemPortRange,
      configuration,
      highAvailabilityServices,
      taskManagerClass)
  } finally {
    // Always close the high-availability services; a failure here is only logged
    try {
      highAvailabilityServices.close()
    } catch {
      case t: Throwable => LOG.warn("Could not properly stop the high availability services.", t)
    }
  }
}
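The cleanup shape in step 3 (run the main work, always close the service afterwards, and only log a failure of close()) can be sketched in isolation. This is a minimal sketch rather than Flink code: CleanupSketch and runWithService are hypothetical names, a plain AutoCloseable stands in for the high-availability services, and System.err stands in for the logger.

```scala
object CleanupSketch {
  /** Runs `body`, then closes `service` whether or not `body` failed.
    * An exception from close() is reported and swallowed, so it neither
    * replaces the result of `body` nor masks an exception thrown by it.
    */
  def runWithService[A](service: AutoCloseable)(body: => A): A =
    try {
      body
    } finally {
      try {
        service.close()
      } catch {
        // Mirrors the excerpt: catch everything and only warn
        case t: Throwable =>
          System.err.println(s"Could not properly stop the service: $t")
      }
    }
}
```

With this shape, a failure while starting the TaskManager still reaches the caller, while a failure during shutdown of the service shows up only as a warning.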
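4. Step into runTaskManager. Using the port range selected above, it finds an available port for the TaskManager's communication, then calls the overloaded runTaskManager method to start the TaskManager: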
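The port search in step 4 can be illustrated on its own, separately from the Flink method. The following is a minimal sketch that uses only java.net.ServerSocket and is not Flink's implementation: PortProbe, isFree and findFreePort are hypothetical names introduced here. It walks an iterator of candidate ports and returns the first one that can be bound.

```scala
import java.io.IOException
import java.net.ServerSocket

object PortProbe {
  /** True if a server socket can currently be bound to `port`. */
  def isFree(port: Int): Boolean =
    try {
      // Binding succeeds only if nothing else is listening on the port;
      // close immediately so the real service can bind it afterwards.
      new ServerSocket(port).close()
      true
    } catch {
      case _: IOException => false // port is in use or cannot be bound
    }

  /** Returns the first bindable port from `ports`, or None if every candidate is taken. */
  def findFreePort(ports: Iterator[Int]): Option[Int] =
    ports.find(isFree)
}
```

For example, PortProbe.findFreePort((50100 to 50200).iterator) tries each port of that (arbitrarily chosen) range in order. A probe like this is inherently racy, because another process may take the port between the check and the real bind, which is one reason to wrap the actual startup in a retry on bind failure.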
def runTaskManager(
    taskManagerHostname: String,
    resourceID: ResourceID,
    actorSystemPortRange: java.util.Iterator[Integer],
    configuration: Configuration,
    highAvailabilityServices: HighAvailabilityServices,
    taskManagerClass: Class[_ <: TaskManager])
  : Unit = {
  // Find an available port by creating a socket
  val result = AkkaUtils.retryOnBindException({
    // Try all ports in the range until successful
    val socket = N