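Apache Flink Task Class Source Code Analysis

1. Introduction

Apache Flink uses two kinds of runtime JVM processes to manage the compute resources of a distributed cluster.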

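  • The JobManager process handles distributed job management, such as task scheduling, checkpointing, and failure recovery. In a high-availability (HA) deployment there are multiple JobManagers: one leader and several standbys. The JobManager is the master in Flink's master/worker architecture.
  • The TaskManager process executes task threads (that is, subtasks) and buffers and exchanges data streams. The TaskManager is the worker in Flink's master/worker architecture.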
    [Figure: processes.png]

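The Task class (Task.java) represents one operator subtask executing on a TaskManager. Operator subtasks run independently of one another, in different threads and possibly on different physical machines or in different containers.

Each operator subtask is run by a dedicated thread.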
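
The one-thread-per-subtask idea can be sketched in plain Java. This is only an illustration and not Flink code: SubtaskSketch, DedicatedThreadDemo, and the "map (i/n)" naming are hypothetical names made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for a subtask; this is NOT Flink's Task class.
final class SubtaskSketch implements Runnable {
    private final String name;
    private final ConcurrentMap<String, String> threadBySubtask;

    SubtaskSketch(String name, ConcurrentMap<String, String> threadBySubtask) {
        this.name = name;
        this.threadBySubtask = threadBySubtask;
    }

    @Override
    public void run() {
        // Record which thread executed this subtask.
        threadBySubtask.put(name, Thread.currentThread().getName());
    }
}

final class DedicatedThreadDemo {
    // Starts one thread per subtask and waits for all of them to finish.
    static ConcurrentMap<String, String> runAll(int parallelism) {
        ConcurrentMap<String, String> threadBySubtask = new ConcurrentHashMap<>();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            String name = "map (" + (i + 1) + "/" + parallelism + ")";
            // The thread is named after the subtask it runs.
            Thread t = new Thread(new SubtaskSketch(name, threadBySubtask), name);
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for subtasks", e);
            }
        }
        return threadBySubtask;
    }

    public static void main(String[] args) {
        runAll(3).forEach((subtask, thread) ->
                System.out.println(subtask + " ran on thread " + thread));
    }
}
```

Because each subtask implements Runnable, handing it to its own Thread is all that is needed to run it in isolation from the other subtasks.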

2. Code analysis

In Flink 1.8, the org.apache.flink.runtime.taskmanager.Task class is 1645 lines long, which makes it a very lengthy class. It implements the Runnable, TaskActions, and CheckpointListener interfaces.

public interface TaskActions {

	/**
	 * Check the execution state of the execution producing a result partition.
	 *
	 * @param jobId ID of the job the partition belongs to.
	 * @param intermediateDataSetId ID of the parent intermediate data set.
	 * @param resultPartitionId ID of the result partition to check. This
	 * identifies the producing execution and partition.
	 */
	void triggerPartitionProducerStateCheck(
		JobID jobId,
		IntermediateDataSetID intermediateDataSetId,
		ResultPartitionID resultPartitionId);

	/**
	 * Fail the owning task with the given throwable.
	 *
	 * @param cause of the failure
	 */
	void failExternally(Throwable cause);
}

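The TaskActions interface defines the operations that can be performed on a Task. It currently has two methods:

  • triggerPartitionProducerStateCheck: checks the execution state of the execution that produces a result partition
  • failExternally: fails the current Task with the given Throwable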
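
To make the failExternally contract concrete, here is a minimal sketch of one way a task could record an external failure exactly once. It is an assumption-laden illustration, not how Flink's Task implements the method: SimpleTaskActions is a hypothetical, cut-down interface that keeps only failExternally, and FailableTaskSketch is a made-up class.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical, simplified interface: only failExternally is kept.
interface SimpleTaskActions {
    void failExternally(Throwable cause);
}

// Hypothetical task that remembers the first failure reported from outside.
final class FailableTaskSketch implements SimpleTaskActions {
    // null means "not failed"; a non-null value is the first failure cause.
    private final AtomicReference<Throwable> failureCause = new AtomicReference<>();

    @Override
    public void failExternally(Throwable cause) {
        // Only the first caller wins; later calls leave the recorded cause unchanged.
        failureCause.compareAndSet(null, Objects.requireNonNull(cause));
    }

    boolean isFailed() {
        return failureCause.get() != null;
    }

    Throwable getFailureCause() {
        return failureCause.get();
    }

    public static void main(String[] args) {
        FailableTaskSketch task = new FailableTaskSketch();
        task.failExternally(new RuntimeException("lost connection"));
        task.failExternally(new RuntimeException("reported too late"));
        System.out.println("failed=" + task.isFailed()
                + ", cause=" + task.getFailureCause().getMessage());
    }
}
```

Keeping the cause in a single AtomicReference means the "failed" flag and the cause can never disagree, even when several threads report failures at the same time.
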

public interface CheckpointListener {

	/**
	 * This method is called as a notification once a distributed checkpoint has been completed.
	 * 
	 * Note that any exception during this method will not cause the checkpoint to
	 * fail any more.
	 * 
	 * @param checkpointId The ID of the checkpoint that has been completed.
	 * @throws Exception
	 */
	void notifyCheckpointComplete(long checkpointId) throws Exception;
}

The CheckpointListener interface defines the notification callback that is invoked after a checkpoint completes.
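
A common use of such a callback is to publish buffered side effects only once the checkpoint that covers them is known to be complete. The sketch below illustrates that pattern under stated assumptions: the interface is redeclared locally with the same signature as above so the example compiles without Flink, and CommitOnCheckpointSketch with its addPending method is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Local copy of the interface quoted above, so the sketch compiles without Flink.
interface CheckpointListener {
    void notifyCheckpointComplete(long checkpointId) throws Exception;
}

// Hypothetical listener: buffers records per checkpoint and publishes them on completion.
final class CommitOnCheckpointSketch implements CheckpointListener {
    private final NavigableMap<Long, List<String>> pending = new TreeMap<>();
    private final List<String> committed = new ArrayList<>();

    // Buffers a record that belongs to the given (not yet completed) checkpoint.
    void addPending(long checkpointId, String record) {
        pending.computeIfAbsent(checkpointId, id -> new ArrayList<>()).add(record);
    }

    @Override
    public void notifyCheckpointComplete(long checkpointId) {
        // Everything buffered for this checkpoint or an earlier one can now be published.
        NavigableMap<Long, List<String>> done = pending.headMap(checkpointId, true);
        for (List<String> records : done.values()) {
            committed.addAll(records);
        }
        done.clear(); // clearing the view also removes those entries from pending
    }

    List<String> getCommitted() {
        return committed;
    }

    int pendingCheckpoints() {
        return pending.size();
    }

    public static void main(String[] args) {
        CommitOnCheckpointSketch sketch = new CommitOnCheckpointSketch();
        sketch.addPending(1L, "a");
        sketch.addPending(2L, "b");
        sketch.addPending(3L, "c");
        sketch.notifyCheckpointComplete(2L);
        System.out.println("committed=" + sketch.getCommitted()
                + ", pending checkpoints=" + sketch.pendingCheckpoints());
    }
}
```

Committing everything up to and including the notified ID keeps the sketch tolerant of a missed notification for an earlier checkpoint.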

2.1 Constructor

    public Task(
        JobInformation jobInformation,
        TaskInformation taskInformation,
        ExecutionAttemptID executionAttemptID,
        AllocationID slotAllocationId,
        int subtaskIndex,
        int attemptNumber,
        Collection<ResultPartitionDeploymentDescriptor> resultPartitionDeploymentDescriptors,
        Collection<InputGateDeploymentDescriptor> inputGateDeploymentDescriptors,
        int targetSlotNumber,
        MemoryManager memManager,
        IOManager ioManager,
        NetworkEnvironment networkEnvironment,
        BroadcastVariableManager bcVarManager,
        TaskStateManager taskStateManager,
        TaskManagerActions taskManagerActions,
        InputSplitProvider inputSplitProvider,
        CheckpointResponder checkpointResponder,
        GlobalAggregateManager aggregateManager,
        BlobCacheService blobService,
        LibraryCacheManager libraryCache,
        FileCache fileCache,
        TaskManagerRuntimeInfo taskManagerConfig,
        @Nonnull TaskMetricGroup metricGroup,
        ResultPartitionConsumableNotifier resultPartitionConsumableNotifier,
        PartitionProducerStateChecker partitionProducerStateChecker,
        Executor executor) {

        Preconditions.checkNotNull(jobInformation);
        Preconditions.checkNotNull(taskInformation);

        Preconditions.checkArgument(0 <= subtaskIndex, "The subtask index must be positive.");
        Preconditions.checkArgument(0 <= attemptNumber, "The attempt number must be positive.");
        Preconditions.checkArgument(0 <= targetSlotNumber, "The target slot number must be positive.");

        this.taskInfo = new TaskInfo(
                taskInformation.getTaskName(),
                taskInformation.getMaxNumberOfSubtaks(),
                subtaskIndex,
                taskInformation.getNumberOfSubtasks(),
                attemptNumber,
                String.valueOf(slotAllocationId));

        this.jobId = jobInformation.getJobId();
        this.vertexId = taskInformation.getJobVertexId();
        this.executionId  = Preconditions.checkNotNull(executionAttemptID);
        this.allocationId = Preconditions.checkNotNull(slotAllocationId);
        this.taskNameWithSubtask = taskInfo.getTaskNameWithSubtasks();
        this.jobConfiguration = jobInformation.getJobConfiguration();
        this.taskConfiguration = taskInformation.getTaskConfiguration();
        this.requiredJarFiles = jobInformation.getRequiredJarFileBlobKeys();
        this.requiredClasspaths = jobInformation.getRequiredClasspathURLs();
        this.nameOfInvokableClass = taskInformation.getInvokableClassName();
        this.serializedExecutionConfig = jobInformation.getSerializedExecutionConfig();

        Configuration tmConfig = taskManagerConfig.getConfiguration();
        this.taskCancellationInterval = tmConfig.getLong(TaskManagerOptions.TASK_CANCELLATION_INTERVAL);
        this.taskCancellationTimeout = tmConfig.getLong(TaskManagerOptions.TASK_CANCELLATION_TIMEOUT);

        this.memoryManager = Preconditions.checkNotNull(memManager);
        this.ioManager = Preconditions.checkNotNull(ioManager);
        this.broadcastVariableManager = Preconditions.checkNotNull(bcVarManager);
        this.taskStateManager = Preconditions.checkNotNull(taskStateManager);
        this.accumulatorRegistry = new AccumulatorRegistry(jobId, executionId);

        this.inputSplitProvider = Preconditions.checkNotNull(inputSplitProvider);
        this.checkpointResponder = Preconditions.checkNotNull(checkpointResponder);
        this.aggregateManager = Preconditions.checkNotNull(aggregateManager);
        // ... (the remainder of the constructor is not included in this excerpt)