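Background
For YARN-based tasks such as MR, Spark, Flink, and even Shell tasks, DolphinScheduler originally checked whether the task launched a YARN application and, if so, parsed out its applicationId. The final DolphinScheduler task state was therefore determined not only by the client process but also by the application's state on YARN. The community later refactored this logic (a well-intentioned change, but still only half-finished), which introduced some problems. For example, in Flink Stream Application mode, a detached mode in which the client is separated from the job, the client shell exits right away, so the task in DolphinScheduler is immediately marked as successful. The job is still running on YARN, but DolphinScheduler can no longer track its state.
So how can we track the state of tasks running on YARN?
Note: this article uses version 3.2.1 as the example.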
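To make the idea of "parsing out the applicationId" concrete, here is a minimal sketch that scans client log lines for YARN application IDs. The class name YarnAppIdExtractor and the sample log text are assumptions made for illustration; this is not DolphinScheduler's own parsing code. It relies only on the fact that YARN application IDs have the form application_<clusterTimestamp>_<sequence>.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of DolphinScheduler.
final class YarnAppIdExtractor {

    // YARN application IDs look like application_1700000000000_0042.
    private static final Pattern APP_ID = Pattern.compile("application_\\d+_\\d+");

    // Collects every distinct application ID found in the given log lines,
    // preserving the order in which they first appear.
    static Set<String> extract(Iterable<String> logLines) {
        Set<String> ids = new LinkedHashSet<>();
        for (String line : logLines) {
            Matcher matcher = APP_ID.matcher(line);
            while (matcher.find()) {
                ids.add(matcher.group());
            }
        }
        return ids;
    }
}
```

Once the IDs are known, the scheduler has a handle on the YARN application that survives the exit of the client shell, which is exactly what the detached Flink case needs.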
Worker Task relationship diagram
First, let's look at how the Worker Task classes relate to one another in DolphinScheduler.

- AbstractTask: defines the basic life-cycle interface of a Task, such as init, handle, and cancel.
- AbstractRemoteTask: implements the handle method using the template method design pattern, and extracts three core interface methods: submitApplication, trackApplicationStatus, and cancelApplication.
- AbstractYarnTask: the abstraction for YARN tasks; its submitApplication, trackApplicationStatus, and cancelApplication can call the YARN API directly.
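The three-level hierarchy above can be sketched as follows. This is a deliberately simplified illustration of the template method pattern, not the real DolphinScheduler classes: the names SketchTask, SketchRemoteTask, and RecordingYarnTask are hypothetical, and the real methods take parameters and throw TaskException.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for AbstractTask: only the life-cycle hooks.
abstract class SketchTask {
    abstract void init();
    abstract void handle();
    abstract void cancel();
}

// Simplified stand-in for AbstractRemoteTask: handle() is the template
// method that fixes the order of the steps; subclasses fill in the steps.
abstract class SketchRemoteTask extends SketchTask {
    @Override
    final void handle() {
        submitApplication();       // step 1: submit the remote application
        trackApplicationStatus();  // step 2: keep tracking until it ends
    }

    @Override
    final void cancel() {
        cancelApplication();
    }

    abstract void submitApplication();
    abstract void trackApplicationStatus();
    abstract void cancelApplication();
}

// Hypothetical YARN-flavoured subclass that only records which steps ran.
final class RecordingYarnTask extends SketchRemoteTask {
    final List<String> calls = new ArrayList<>();

    @Override void init() { calls.add("init"); }
    @Override void submitApplication() { calls.add("submitApplication"); }
    @Override void trackApplicationStatus() { calls.add("trackApplicationStatus"); }
    @Override void cancelApplication() { calls.add("cancelApplication"); }
}
```

Calling init() and then handle() on a RecordingYarnTask records init, submitApplication, and trackApplicationStatus in that order. The point of the pattern is that a subclass cannot skip or reorder the tracking step.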
Implementing YARN state tracking in AbstractYarnTask
AbstractYarnTask can implement YARN state tracking; see org.apache.dolphinscheduler.plugin.task.api.AbstractYarnTask. The complete code is as follows:
public abstract class AbstractYarnTask extends AbstractRemoteTask {

    private static final int MAX_RETRY_ATTEMPTS = 3;

    private ShellCommandExecutor shellCommandExecutor;

    public AbstractYarnTask(TaskExecutionContext taskRequest) {
        super(taskRequest);
        this.shellCommandExecutor = new ShellCommandExecutor(this::logHandle, taskRequest);
    }

    @Override
    public void submitApplication() throws TaskException {
        try {
            IShellInterceptorBuilder shellActuatorBuilder =
                    ShellInterceptorBuilderFactory.newBuilder()
                            .properties(getProperties())
                            // todo: do we need to move the replace to subclass?
                            .appendScript(getScript().replaceAll("\\r\\n", System.lineSeparator()));
            // SHELL task exit code
            TaskResponse response = shellCommandExecutor.run(shellActuatorBuilder, null);
            setExitStatusCode(response.getExitStatusCode());
            setAppIds(String.join(TaskConstants.COMMA, getApplicationIds()));
            setProcessId(response.getProcessId());
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            log.info("The current yarn task has been interrupted", ex);
            setExitStatusCode(TaskConstants.EXIT_CODE_FAILURE);
            throw new TaskException("The current yarn task has been interrupted", ex);
        } catch (Exception e) {
            log.error("yarn process failure", e);
            exitStatusCode = -1;
            throw new TaskException("Execute task failed", e);
        }
    }

    @Override
    public void trackApplicationStatus() throws TaskException {
        if (StringUtils.isEmpty(appIds)) {
            return;
        }
        List<String> appIdList = Arrays.asList(appIds.split(","));
        boolean continueTracking = true;
        while (continueTracking) {
            Map
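The listing above breaks off inside the tracking loop. As a rough illustration of the general polling idea only, and not a reconstruction of the article's code, the sketch below polls each application until it reaches a terminal YARN state. Everything in it is an assumption: the class YarnTrackingSketch, the stateLookup function (which in practice might wrap a call to the YARN API), and the poll interval are hypothetical. The only facts it relies on are that FINISHED, FAILED, and KILLED are the terminal YARN application states.

```java
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch, not DolphinScheduler code.
final class YarnTrackingSketch {

    // Terminal YARN application states; anything else means "still in progress".
    private static final Set<String> TERMINAL = Set.of("FINISHED", "FAILED", "KILLED");

    // Polls every application until it reaches a terminal state.
    // Returns true only if all applications ended in FINISHED.
    // Simplification: a real tracker should also inspect the application's
    // final status, because a FINISHED application can still have failed.
    static boolean trackUntilDone(List<String> appIds,
                                  Function<String, String> stateLookup,
                                  long pollMillis) {
        boolean allFinished = true;
        for (String appId : appIds) {
            String state = stateLookup.apply(appId);
            while (!TERMINAL.contains(state)) {
                try {
                    Thread.sleep(pollMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and report failure to the caller.
                    Thread.currentThread().interrupt();
                    return false;
                }
                state = stateLookup.apply(appId);
            }
            if (!"FINISHED".equals(state)) {
                allFinished = false;
            }
        }
        return allFinished;
    }
}
```

Passing the lookup in as a function keeps the loop independent of how the state is obtained, so the same loop can be exercised with a fake lookup in tests and with a real YARN client in production.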




