https://issues.apache.org/jira/browse/YARN-1336
Possible issues when versions are inconsistent
https://issues.apache.org/jira/browse/YARN-5630
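NM restart
NMNullStateStoreService: the instance used when the feature is disabled
NMLeveldbStateStoreService: the instance used when the feature is enabled
NM restart works by persisting state to a LevelDB database on disk; at startup the NM reads the state back and re-associates itself with the applications, containers, and so on.
A restart does not make containers fail immediately. However, if the NM has not come back by the time the heartbeat expires (yarn.nm.liveness-monitor.expiry-interval-ms, default 600000 ms, i.e. 600 s), the RM marks the containers running on that NM as dead and reschedules them on other healthy nodes. The FairScheduler removes the node and no longer assigns any containers to it.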
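The expiry rule can be pictured with a small sketch. This is not RM code: NodeLivenessSketch and its methods are hypothetical names, and the 600000 ms constant is assumed to mirror the default of yarn.nm.liveness-monitor.expiry-interval-ms.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of heartbeat-expiry tracking; not the RM's real implementation. */
class NodeLivenessSketch {
    // Assumed to mirror the 10-minute default expiry interval.
    static final long EXPIRY_INTERVAL_MS = 600_000L;

    // Last heartbeat time per node, in milliseconds.
    private final Map<String, Long> lastHeartbeatMs = new HashMap<>();

    /** Record a heartbeat from the given node. */
    void heartbeat(String nodeId, long nowMs) {
        lastHeartbeatMs.put(nodeId, nowMs);
    }

    /** A node counts as lost once no heartbeat arrived within the expiry interval. */
    boolean isLost(String nodeId, long nowMs) {
        Long last = lastHeartbeatMs.get(nodeId);
        return last == null || nowMs - last > EXPIRY_INTERVAL_MS;
    }
}
```

An NM that restarts and heartbeats again before the interval elapses is never reported as lost in this model, which is why a quick restart does not cost any containers.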
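NM decommissioning
Decommissioning relies on the exclude (blacklist) file: when the refreshNodes command is run, the nodes listed in it are marked as Decommissioned.
1. Direct decommission
The NM shuts down immediately and all containers exit.
2. Graceful decommission
The NM does not shut down immediately; it exits once all containers have finished, bounded by a timeout (yarn.resourcemanager.nodemanager-graceful-decommission-timeout-secs, default 1 hour).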
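The difference between the two modes comes down to one decision, sketched below. This is only an illustration of the rule stated above; GracefulDecommissionSketch and canShutDown are hypothetical names, and the real RM tracks more state than a container count.

```java
/** Hypothetical sketch of the graceful-decommission decision; not YARN code. */
class GracefulDecommissionSketch {
    /**
     * Returns true when the node may shut down: either nothing is running
     * any more, or the graceful timeout has elapsed.
     */
    static boolean canShutDown(int runningContainers, long elapsedSecs, long timeoutSecs) {
        return runningContainers == 0 || elapsedSecs >= timeoutSecs;
    }
}
```

Direct decommission corresponds to a timeout of 0 in this model: the node may shut down at once regardless of what is still running.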
NM restart configuration
<property>
  <name>yarn.nodemanager.recovery.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.nodemanager.recovery.dir</name>
  <value>/xxxx/yarn-nm-recovery</value>
</property>
<!-- When this setting is enabled, the NM does not kill containers immediately when it shuts down.
     When it is disabled, the NM kills containers immediately when it shuts down (unless a rolling upgrade is in progress).
-->
<property>
  <name>yarn.nodemanager.recovery.supervised</name>
  <value>true</value>
</property>
<property>
  <name>yarn.nodemanager.address</name>
  <value>${yarn.nodemanager.hostname}:port</value>
</property>
1. Set up the recovery directory properly; otherwise the state store fails to initialize:
2022-03-01 17:43:39,562 INFO org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService failed in state INITED; cause: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: /xxxxxx/yarn-nm-recovery/yarn-nm-state/LOCK: No such file or directory
org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: /xxxx/yarn-nm-recovery/yarn-nm-state/LOCK: No such file or directory
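LevelDB
Files with the .sst extension are LevelDB's on-disk data files;
files with the .log extension are operation log files, recording the most recent operations;
the CURRENT file contains a single item, the name of the current MANIFEST file; LevelDB reads CURRENT first to find out which MANIFEST file is the valid one;
the LOCK file prevents other processes from accidentally reading or writing this directory; the lock is released automatically when the process exits;
the LOG file records key operational events of the database, such as details of each minor and major compaction;
the MANIFEST file stores the key range, level, and other metadata of every file, and its suffix is a version number. Each time the database is reopened, a new MANIFEST file with a different version number is generated, and the old MANIFEST file is then deleted;
a delete in LevelDB does not remove the key-value pair immediately. Instead, the delete is turned into an update that writes a special key-value pair whose value is a deletion marker; the pair is physically removed only when LevelDB later triggers a data merge (compaction) under certain conditions;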
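The delete-marker behaviour can be illustrated with a toy store. This is a sketch of the idea only, not LevelDB's storage format; TombstoneStoreSketch and its methods are hypothetical names, and an empty Optional stands in for the deletion marker.

```java
import java.util.Optional;
import java.util.TreeMap;

/** Toy illustration of tombstone-based deletion; not LevelDB's actual storage format. */
class TombstoneStoreSketch {
    // An empty Optional plays the role of the deletion marker.
    private final TreeMap<String, Optional<String>> entries = new TreeMap<>();

    void put(String key, String value) {
        entries.put(key, Optional.of(value));
    }

    /** Deleting writes a marker instead of removing the entry. */
    void delete(String key) {
        entries.put(key, Optional.empty());
    }

    /** Reads treat a marker as "not found". */
    String get(String key) {
        Optional<String> v = entries.get(key);
        return (v == null || !v.isPresent()) ? null : v.get();
    }

    /** Compaction is the step that physically drops deleted entries. */
    void compact() {
        entries.values().removeIf(v -> !v.isPresent());
    }

    /** Number of entries physically held, markers included. */
    int physicalSize() {
        return entries.size();
    }
}
```

After a delete, get returns null while physicalSize stays unchanged until compact is called, which is why the NM state store schedules periodic compaction.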
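Example logs
While containers are running, stop the NM: the containers keep running; when they finish, their processes disappear and the map progress shows 100%, but the AM is still alive. Once the NM restarts, the AM reports completion and exits.
The logs are as follows.
Roughly, during initialization the NM performs recovery: it recovers the application information and the container information, re-associates them with the corresponding apps and processes, recovers logs, and so on;
it moves the apps and containers to the running state, and if it finds that a container has already finished, it starts shutting that container down;
on startup the NM registers with the RM, sending the NM's ContainerStatuses and so on.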
2022-03-02 19:42:37,469 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.RecoveredContainerLaunch: Recovered container container_e38_1645523508350_0038_01_000002 succeeded
2022-03-02 19:42:39,895 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending out 2 NM container statuses: [[container_e38_1645523508350_0038_01_000001, CreateTime: xxxxxxxx
2022-03-02 19:42:39,901 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: container_e38_1645523508350_0038_01_000001's xxxxxxxx
2022-03-02 19:42:39,904 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Registering with RM using containers :[[container_e38_1645523508350_0038_01_000001, CreateTime: xxxxxx
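Node hosting the AM stays up
Stop the NM while the job is running and kill the container: when the NM starts, it loads the container state, concludes that the container failed, and the container is relaunched on another node.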
2022-03-02 20:33:50,173 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_e38_1645523508350_0039_01_000002 transitioned from RUNNING to EXITED_WITH_FAILURE
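Stop the NM while the job is running, kill the container, and do not start the NM: the RM considers the node lost, and the AM launches a new container on another node.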
AttemptID:attempt_1645523508350_0040_m_000000_0 Timed out after 600 secs Container released on a *lost* node cleanup failed for container container_e38_1645523508350_0040_01_000002 : java.net.ConnectException: Call From xxxxxx failed on connection exception: java.net.ConnectException: Connection refused;
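Stop the NM while the job is running without killing the container: the reduce container fails with an exception while fetching shuffle data; it recovers within seconds once the NM is started.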
22/03/03 11:06:20 INFO mapreduce.Job: Task Id : attempt_1645523508350_0041_r_000000_1, Status : FAILED
Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#2
2022-03-03 11:03:13,875 WARN [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to connect to xxx with 1 map outputs
java.net.ConnectException: Connection refused (Connection refused)
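Stop the NM while the job is running without killing the container: the reduce attempt fails on shuffle; after the NM becomes a lost node, the container is launched on another node and the job completes.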
cleanup failed for container container_e38_1645523508350_0043_01_000002 : java.net.ConnectException: Call From xxx failed on connection exception: java.net.ConnectException: Connection refused;
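Node hosting the AM is shut down
Stop the NM while the job is running without killing the AM: the job completes directly.
Stop the NM while the job is running and kill the AM: the client keeps retrying the connection to the node; when the node comes back online, the job is re-executed on another node.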
22/03/03 13:31:46 INFO ipc.Client: Retrying connect to server: xxx. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
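Node hosting both the AM and containers is shut down
Stop the NM while the job is running without killing the AM or the containers: the reduce attempt fails on shuffle; after the NM becomes a lost node, the AM and containers are launched on other nodes and the job completes.
Decommission
Decommission the NM while the job is running: the containers are shut down immediately,
and the LevelDB directory is emptied.
Graceful decommission
Gracefully decommission the NM while the job is running: the containers run to completion and then shut down, and the NM shuts down after all containers have finished,
and the LevelDB directory is emptied.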
Source code walkthrough
Initialization
NodeManager startup
serviceInit calls initAndStartRecoveryStore(conf);
private void initAndStartRecoveryStore(Configuration conf)
    throws IOException {
  // Read yarn.nodemanager.recovery.enabled from conf; true enables NM restart recovery
  boolean recoveryEnabled = conf.getBoolean(
      YarnConfiguration.NM_RECOVERY_ENABLED,
      YarnConfiguration.DEFAULT_NM_RECOVERY_ENABLED);
  if (recoveryEnabled) {
    FileSystem recoveryFs = FileSystem.getLocal(conf);
    // Read yarn.nodemanager.recovery.dir from conf
    String recoveryDirName = conf.get(YarnConfiguration.NM_RECOVERY_DIR);
    if (recoveryDirName == null) {
      throw new IllegalArgumentException("Recovery is enabled but " +
          YarnConfiguration.NM_RECOVERY_DIR + " is not set.");
    }
    Path recoveryRoot = new Path(recoveryDirName);
    // Create the state store directory with permission 700
    recoveryFs.mkdirs(recoveryRoot, new FsPermission((short)0700));
    // Instantiate NMLeveldbStateStoreService
    nmStore = new NMLeveldbStateStoreService();
  } else {
    nmStore = new NMNullStateStoreService();
  }
  nmStore.init(conf);
  // Calls the method in NMStateStoreService, the parent class of NMLeveldbStateStoreService,
  // which in turn calls NMLeveldbStateStoreService.startStorage(); that method is empty
  nmStore.start();
}
serviceInit of nmStore.init(conf)
This actually calls the method in NMStateStoreService, the parent class of NMLeveldbStateStoreService,
which in turn calls NMLeveldbStateStoreService.initStorage(conf);
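protected void initStorage(Configuration conf)
    throws IOException {
  // Obtain the LevelDB instance
  db = openDatabase(conf);
  // Check CURRENT_VERSION_INFO, i.e. check the compatibility of the state store:
  // a compatible change is treated as a minor version update, an incompatible one as a major version update
  checkVersion();
  // Read yarn.nodemanager.recovery.compaction-interval-secs (default 3600 s) and start a timer
  // that periodically runs db.compactRange(null, null);
  // that call makes LevelDB compact and drop the old (deletion-marker) key-value pairs,
  // which ties in with LevelDB's deletion mechanism
  startCompactionTimer(conf);
}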
db = openDatabase(conf);
protected DB openDatabase(Configuration conf) throws IOException {
  // Create the directory /xxx/yarn-nm-recovery/yarn-nm-state
  // /xxxx/yarn-nm-recovery/nm-aux-services/mapreduce_shuffle/mapreduce_shuffle_state
  Path storeRoot = createStorageDir(conf);
  Options options = new Options();
  options.createIfMissing(false);
  options.logger(new LeveldbLogger());
  LOG.info("Using state database at " + storeRoot + " for recovery");
  File dbfile = new File(storeRoot.toString());
  try {
    // Open the existing database
    db = JniDBFactory.factory.open(dbfile, options);
  } catch (NativeDB.DBException e) {
    // Create it if it does not exist
    if (e.isNotFound() || e.getMessage().contains(" does not exist ")) {
      LOG.info("Creating state database at " + dbfile);
      isNewlyCreated = true;
      options.createIfMissing(true);
      try {
        db = JniDBFactory.factory.open(dbfile, options);
        // Insert the key nm-schema-version whose value is the version proto;
        // once parsed, it yields version number 1.2
        storeVersion();
      } catch (DBException dbErr) {
        throw new IOException(dbErr.getMessage(), dbErr);
      }
    } else {
      throw e;
    }
  }
  return db;
}
checkVersion();
protected void checkVersion() throws IOException {
  // Load the version proto from LevelDB and parse it into a version number
  Version loadedVersion = loadVersion();
  LOG.info("Loaded NM state version info " + loadedVersion);
  // Compare with the current version; if they are equal, nothing needs to be done
  if (loadedVersion.equals(getCurrentVersion())) {
    return;
  }
  // If the major versions match, the state is compatible and recovery proceeds
  if (loadedVersion.isCompatibleTo(getCurrentVersion())) {
    LOG.info("Storing NM state version info " + getCurrentVersion());
    // Store the current version proto in LevelDB, replacing the previous one
    storeVersion();
  } else {
    // If the major versions differ, throw so that the user runs a separate
    // upgrade tool or deletes the incompatible state
    throw new IOException(
        "Incompatible version for NM state: expecting NM state version "
        + getCurrentVersion() + ", but loading version " + loadedVersion);
  }
}
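The three outcomes of checkVersion can be summarised in a standalone sketch. It assumes that compatibility means "same major version", as the comments above describe; SchemaVersionSketch and the Outcome enum are hypothetical names rather than Hadoop's Version class, and where the real method throws an IOException the sketch returns INCOMPATIBLE.

```java
/** Hypothetical sketch of the major/minor compatibility rule; not Hadoop's Version class. */
class SchemaVersionSketch {
    enum Outcome { SAME, COMPATIBLE_UPGRADE, INCOMPATIBLE }

    final int major;
    final int minor;

    SchemaVersionSketch(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    /** Assumed rule: versions are compatible when only the minor part differs. */
    boolean isCompatibleTo(SchemaVersionSketch other) {
        return this.major == other.major;
    }

    /** Classify the stored version against the version the running code expects. */
    static Outcome check(SchemaVersionSketch loaded, SchemaVersionSketch current) {
        if (loaded.major == current.major && loaded.minor == current.minor) {
            return Outcome.SAME;                // nothing to do
        }
        if (loaded.isCompatibleTo(current)) {
            return Outcome.COMPATIBLE_UPGRADE;  // overwrite the stored version
        }
        return Outcome.INCOMPATIBLE;            // the real code throws IOException here
    }
}
```

For example, a store written as 1.1 and read by code expecting 1.2 yields COMPATIBLE_UPGRADE, while 1.2 against 2.0 yields INCOMPATIBLE.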
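Storing
The container launch command is issued to the NodeManager by each ApplicationMaster through the RPC method ContainerManagementProtocol#startContainer; the ContainerManager component of the NodeManager (implemented as ContainerManagerImpl) receives and handles the request.
startContainers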
@Override
public StartContainersResponse startContainers(
    StartContainersRequest requests) throws YarnException, IOException {
  // ... (excerpt)
  // Start the container
  startContainerInternal(containerTokenIdentifier, request);
  // ...
}
protected void startContainerInternal(
    ContainerTokenIdentifier containerTokenIdentifier,
    StartContainerRequest request) throws YarnException, IOException {
  // ... (excerpt)
  // If the applicationID is not known yet, create the application and store its state
  if (!context.getApplications().containsKey(applicationID)) {
    // Create the application
    // populate the flow context from the launch context if the timeline
    // service v.2 is enabled
    Application application =
        new ApplicationImpl(dispatcher, user, flowContext,
            applicationID, credentials, context);
    // The key is ContainerManager/applications/appId, the value is a ContainerManagerApplicationProto
    context.getNMStateStore().storeApplication(applicationID,
        buildAppProto(applicationID, user, credentials, appAcls,
            logAggregationContext, flowContext));
  }
  // Store the container state
  this.context.getNMStateStore().storeContainer(containerId,
      containerTokenIdentifier.getVersion(), containerStartTime, request);
}
Let's look at what the value contains
// ContainerManagerApplicationProto carries Id/User/LogAggregationContext/Credentials/Acls and so on
static ContainerManagerApplicationProto buildAppProto(ApplicationImpl app)
    throws IOException {
  ContainerManagerApplicationProto.Builder builder =
      ContainerManagerApplicationProto.newBuilder();
  builder.setId(((ApplicationIdPBImpl) app.appId).getProto());
  builder.setUser(app.getUser());
  if (app.logAggregationContext != null) {
    builder.setLogAggregationContext((
        (LogAggregationContextPBImpl)app.logAggregationContext).getProto());
  }
  builder.clearCredentials();
  if (app.credentials != null) {
    DataOutputBuffer dob = new DataOutputBuffer();
    app.credentials.writeTokenStorageToStream(dob);
    builder.setCredentials(ByteString.copyFrom(dob.getData()));
  }
  builder.clearAcls();
  if (app.applicationACLs != null) {
    for (Map.Entry<ApplicationAccessType, String> acl : app
        .applicationACLs.entrySet()) {
      YarnProtos.ApplicationACLMapProto p = YarnProtos
          .ApplicationACLMapProto.newBuilder()
          .setAccessType(ProtoUtils.convertToProtoFormat(acl.getKey()))
          .setAcl(acl.getValue())
          .build();
      builder.addAcls(p);
    }
  }
  builder.setAppLogAggregationInitedTime(app.applicationLogInitedTimestamp);
  builder.clearFlowContext();
  if (app.flowContext != null && app.flowContext.getFlowName() != null
      && app.flowContext.getFlowVersion() != null) {
    FlowContextProto fcp = FlowContextProto.newBuilder()
        .setFlowName(app.flowContext.getFlowName())
        .setFlowVersion(app.flowContext.getFlowVersion())
        .setFlowRunId(app.flowContext.getFlowRunId()).build();
    builder.setFlowContext(fcp);
  }
  return builder.build();
}
// optional .hadoop.yarn.ApplicationIdProto id = 1;
// optional string user = 2;
// optional bytes credentials = 3;
// repeated .hadoop.yarn.ApplicationACLMapProto acls = 4;
// optional .hadoop.yarn.LogAggregationContextProto log_aggregation_context = 5;
// optional int64 appLogAggregationInitedTime = 6 [default = -1];
// optional .hadoop.yarn.FlowContextProto flowContext = 7;
storeApplication
@Override
public void storeApplication(ApplicationId appId,
    ContainerManagerApplicationProto p) throws IOException {
  if (LOG.isDebugEnabled()) {
    LOG.debug("storeApplication: appId=" + appId
        + ", proto=" + p);
  }
  // The key is ContainerManager/applications/appId, the value is a ContainerManagerApplicationProto
  String key = APPLICATIONS_KEY_PREFIX + appId;
  try {
    db.put(bytes(key), p.toByteArray());
  } catch (DBException e) {
    throw new IOException(e);
  }
}
Deleting
removeApplication
After the application finishes, the corresponding key-value pair is deleted
@Override
public void removeApplication(ApplicationId appId)
    throws IOException {
  if (LOG.isDebugEnabled()) {
    LOG.debug("removeApplication: appId=" + appId);
  }
  try {
    WriteBatch batch = db.createWriteBatch();
    try {
      String key = APPLICATIONS_KEY_PREFIX + appId;
      batch.delete(bytes(key));
      db.write(batch);
    } finally {
      batch.close();
    }
  } catch (DBException e) {
    throw new IOException(e);
  }
}
storeContainer
@Override
public void storeContainer(ContainerId containerId, int containerVersion,
    long startTime, StartContainerRequest startRequest) throws IOException {
  String idStr = containerId.toString();
  if (LOG.isDebugEnabled()) {
    LOG.debug("storeContainer: containerId= " + idStr
        + ", startRequest= " + startRequest);
  }
  String keyRequest = getContainerKey(idStr, CONTAINER_REQUEST_KEY_SUFFIX);
  String keyVersion = getContainerVersionKey(idStr);
  String keyStartTime =
      getContainerKey(idStr, CONTAINER_START_TIME_KEY_SUFFIX);
  try {
    WriteBatch batch = db.createWriteBatch();
    try {
      // The key is ContainerManager/containers/containerId/request, the value is the StartContainerRequest
      batch.put(bytes(keyRequest),
          ((StartContainerRequestPBImpl) startRequest).getProto().
              toByteArray());
      // The key is ContainerManager/containers/containerId/starttime, the value is the start time
      batch.put(bytes(keyStartTime), bytes(Long.toString(startTime)));
      if (containerVersion != 0) {
        // The key is ContainerManager/containers/containerId/version, the value is the version number
        batch.put(bytes(keyVersion),
            bytes(Integer.toString(containerVersion)));
      }
      db.write(batch);
    } finally {
      batch.close();
    }
  } catch (DBException e) {
    throw new IOException(e);
  }
}
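The key layout used by storeApplication and storeContainer can be written down as a small helper. The prefix and suffix strings below are assumptions read off the comments in the excerpts, not verified against a particular Hadoop release, and NmStateKeysSketch is a hypothetical name.

```java
/** Sketch of the NM state-store key layout; the constant values are assumptions. */
class NmStateKeysSketch {
    static final String APPLICATIONS_KEY_PREFIX = "ContainerManager/applications/";
    static final String CONTAINERS_KEY_PREFIX = "ContainerManager/containers/";
    static final String REQUEST_SUFFIX = "/request";
    static final String START_TIME_SUFFIX = "/starttime";
    static final String VERSION_SUFFIX = "/version";

    /** One key per application. */
    static String applicationKey(String appId) {
        return APPLICATIONS_KEY_PREFIX + appId;
    }

    /** Several keys per container, all sharing the "<prefix><containerId>" stem. */
    static String containerKey(String containerId, String suffix) {
        return CONTAINERS_KEY_PREFIX + containerId + suffix;
    }
}
```

Because every key of one container shares the same stem, all of a container's records sit next to each other in LevelDB's sorted key space, which is what makes a prefix scan during recovery possible.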
removeContainer
When a container finishes, all of its LevelDB data is deleted
@Override
public void removeContainer(ContainerId containerId)
    throws IOException {
  if (LOG.isDebugEnabled()) {
    LOG.debug("removeContainer: containerId=" + containerId);
  }
  String keyPrefix = CONTAINERS_KEY_PREFIX + containerId.toString();
  try {
    WriteBatch batch = db.createWriteBatch();
    try {
      batch.delete(bytes(keyPrefix + CONTAINER_REQUEST_KEY_SUFFIX));
      batch.delete(bytes(keyPrefix + CONTAINER_DIAGS_KEY_SUFFIX));
      batch.delete(bytes(keyPrefix + CONTAINER_LAUNCHED_KEY_SUFFIX));
      batch.delete(bytes(keyPrefix + CONTAINER_QUEUED_KEY_SUFFIX));
      batch.delete(bytes(keyPrefix + CONTAINER_PAUSED_KEY_SUFFIX));
      batch.delete(bytes(keyPrefix + CONTAINER_KILLED_KEY_SUFFIX));
      batch.delete(bytes(keyPrefix + CONTAINER_EXIT_CODE_KEY_SUFFIX));
      batch.delete(bytes(keyPrefix + CONTAINER_UPDATE_TOKEN_SUFFIX));
      List<String> unknownKeysForContainer = containerUnknownKeySuffixes
          .removeAll(containerId);
      for (String unknownKeySuffix : unknownKeysForContainer) {
        batch.delete(bytes(keyPrefix + unknownKeySuffix));
      }
      db.write(batch);
    } finally {
      batch.close();
    }
  } catch (DBException e) {
    throw new IOException(e);
  }
}
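removeContainer queues every delete in a WriteBatch and submits them with a single db.write call. A toy version of that pattern follows, assuming a plain TreeMap in place of LevelDB; BatchSketch and its methods are hypothetical names and do not reproduce the leveldbjni API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy write batch over a TreeMap; hypothetical, not the leveldbjni WriteBatch. */
class BatchSketch {
    private final List<String> pendingDeletes = new ArrayList<>();

    /** Queue a delete; the store is untouched until applyTo is called. */
    void delete(String key) {
        pendingDeletes.add(key);
    }

    /** Apply all queued deletes in one step; deleting a missing key is a no-op. */
    void applyTo(TreeMap<String, String> db) {
        for (String key : pendingDeletes) {
            db.remove(key);
        }
        pendingDeletes.clear();
    }
}
```

Deleting keys that were never written is harmless in this model, which matches how removeContainer issues a delete for every known suffix without first checking which ones exist.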
Recovery
When the NM restarts, the ContainerManagerImpl service is initialized again, and during initialization it calls recover();
private void recover() throws IOException, URISyntaxException {
  NMStateStoreService stateStore = context.getNMStateStore();
  if (stateStore.canRecover()) {
    // Load the localization state
    rsrcLocalizationSrvc.recoverLocalizedResources(
        stateStore.loadLocalizationState());
    // Load the application state
    RecoveredApplicationsState appsState = stateStore.loadApplicationsState();
    for (ContainerManagerApplicationProto proto :
        appsState.getApplications()) {
      if (LOG.isDebugEnabled()) {
        LOG.debug("Recovering application with state: " + proto.toString());
      }
      // Recover the application
      recoverApplication(proto);
    }
    // Load the container state
    for (RecoveredContainerState rcs : stateStore.loadContainersState()) {
      if (LOG.isDebugEnabled()) {
        LOG.debug("Recovering container with state: " + rcs);
      }
      // Recover the container
      recoverContainer(rcs);
    }
    // Recovery AMRMProxy state after apps and containers are recovered
    if (this.amrmProxyEnabled) {
      this.getAMRMProxyService().recover();
    }
    //Dispatching the RECOVERY_COMPLETED event through the dispatcher
    //so that all the paused, scheduled and queued containers will
    //be scheduled for execution on availability of resources.
    dispatcher.getEventHandler().handle(
        new ContainerSchedulerEvent(null,
            ContainerSchedulerEventType.RECOVERY_COMPLETED));
  } else {
    LOG.info("Not a recoverable state store. Nothing to recover.");
  }
}
Loading the applications state
The state is loaded with a LeveldbIterator and the key prefix ContainerManager/applications/
@Override
public RecoveredApplicationsState loadApplicationsState()
    throws IOException {
  RecoveredApplicationsState state = new RecoveredApplicationsState();
  state.applications = new ArrayList<ContainerManagerApplicationProto>();
  String keyPrefix = APPLICATIONS_KEY_PREFIX;
  LeveldbIterator iter = null;
  try {
    iter = new LeveldbIterator(db);
    iter.seek(bytes(keyPrefix));
    while (iter.hasNext()) {
      Entry<byte[], byte[]> entry = iter.next();
      String key = asString(entry.getKey());
      if (!key.startsWith(keyPrefix)) {
        break;
      }
      state.applications.add(
          ContainerManagerApplicationProto.parseFrom(entry.getValue()));
    }
  } catch (DBException e) {
    throw new IOException(e);
  } finally {
    if (iter != null) {
      iter.close();
    }
  }
  // Clean up deprecated records of finished apps
  cleanupDeprecatedFinishedApps();
  return state;
}
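The seek-then-scan loop in loadApplicationsState relies on keys being sorted, so every key with a given prefix is contiguous. The same idea in a self-contained form, using a TreeMap as a stand-in for LevelDB (PrefixScanSketch is a hypothetical name; for ASCII keys the TreeMap's String ordering matches LevelDB's byte ordering):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Prefix scan over a sorted map, mirroring the seek / startsWith / break loop. */
class PrefixScanSketch {
    static List<String> valuesWithPrefix(TreeMap<String, String> db, String prefix) {
        List<String> result = new ArrayList<>();
        // tailMap(prefix, true) plays the role of iter.seek(prefix):
        // it starts at the first key that is >= prefix.
        for (Map.Entry<String, String> e : db.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // sorted order: once the prefix stops matching, it never matches again
            }
            result.add(e.getValue());
        }
        return result;
    }
}
```

The early break keeps the scan from reading the rest of the database once it has left the requested prefix.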
Recovering the application state: the application is rebuilt from the loaded value
private void recoverApplication(ContainerManagerApplicationProto p)
    throws IOException {
  ApplicationId appId = new ApplicationIdPBImpl(p.getId());
  Credentials creds = new Credentials();
  creds.readTokenStorageStream(
      new DataInputStream(p.getCredentials().newInput()));
  List<ApplicationACLMapProto> aclProtoList = p.getAclsList();
  Map<ApplicationAccessType, String> acls =
      new HashMap<ApplicationAccessType, String>(aclProtoList.size());
  for (ApplicationACLMapProto aclProto : aclProtoList) {
    acls.put(ProtoUtils.convertFromProtoFormat(aclProto.getAccessType()),
        aclProto.getAcl());
  }
  LogAggregationContext logAggregationContext = null;
  if (p.getLogAggregationContext() != null) {
    logAggregationContext =
        new LogAggregationContextPBImpl(p.getLogAggregationContext());
  }
  FlowContext fc = null;
  if (p.getFlowContext() != null) {
    FlowContextProto fcp = p.getFlowContext();
    fc = new FlowContext(fcp.getFlowName(), fcp.getFlowVersion(),
        fcp.getFlowRunId());
    if (LOG.isDebugEnabled()) {
      LOG.debug(
          "Recovering Flow context: " + fc + " for an application " + appId);
    }
  } else {
    // in upgrade situations, where there is no prior existing flow context,
    // default would be used.
    fc = new FlowContext(TimelineUtils.generateDefaultFlowName(null, appId),
        YarnConfiguration.DEFAULT_FLOW_VERSION, appId.getClusterTimestamp());
    if (LOG.isDebugEnabled()) {
      LOG.debug(
          "No prior existing flow context found. Using default Flow context: "
          + fc + " for an application " + appId);
    }
  }
  LOG.info("Recovering application " + appId);
  ApplicationImpl app = new ApplicationImpl(dispatcher, p.getUser(), fc,
      appId, creds, context, p.getAppLogAggregationInitedTime());
  context.getApplications().put(appId, app);
  app.handle(new ApplicationInitEvent(appId, acls, logAggregationContext));
}
Loading the container state works the same way: the state is read back from the LevelDB database
loadContainersState
@Override
public List<RecoveredContainerState> loadContainersState()
    throws IOException {
  ArrayList<RecoveredContainerState> containers =
      new ArrayList<RecoveredContainerState>();
  ArrayList<ContainerId> containersToRemove =
      new ArrayList<ContainerId>();
  LeveldbIterator iter = null;
  try {
    iter = new LeveldbIterator(db);
    iter.seek(bytes(CONTAINERS_KEY_PREFIX));
    while (iter.hasNext()) {
      Entry<byte[],byte[]> entry = iter.peekNext();
      String key = asString(entry.getKey());
      if (!key.startsWith(CONTAINERS_KEY_PREFIX)) {
        break;
      }
      int idEndPos = key.indexOf('/', CONTAINERS_KEY_PREFIX.length());
      if (idEndPos < 0) {
        throw new IOException("Unable to determine container in key: " + key);
      }
      ContainerId containerId = ContainerId.fromString(
          key.substring(CONTAINERS_KEY_PREFIX.length(), idEndPos));
      String keyPrefix = key.substring(0, idEndPos+1);
      RecoveredContainerState rcs = loadContainerState(containerId,
          iter, keyPrefix);
      // Don't load container without StartContainerRequest
      // Containers without a start request are removed; only containers that need recovery are kept
      if (rcs.startRequest != null) {
        containers.add(rcs);
      } else {
        containersToRemove.add(containerId);
      }
    }
  } catch (DBException e) {
    throw new IOException(e);
  } finally {
    if (iter != null) {
      iter.close();
    }
  }
  // remove container without StartContainerRequest
  for (ContainerId containerId : containersToRemove) {
    LOG.warn("Remove container " + containerId +
        " with incomplete records");
    try {
      removeContainer(containerId);
      // TODO: kill and cleanup the leaked container
    } catch (IOException e) {
      LOG.error("Unable to remove container " + containerId +
          " in store", e);
    }
  }
  return containers;
}
loadContainerState loads the container's key-value pairs and derives the container's status from them
private RecoveredContainerState loadContainerState(ContainerId containerId,
    LeveldbIterator iter, String keyPrefix) throws IOException {
  RecoveredContainerState rcs = new RecoveredContainerState();
  rcs.status = RecoveredContainerStatus.REQUESTED;
  while (iter.hasNext()) {
    Entry<byte[],byte[]> entry = iter.peekNext();
    String key = asString(entry.getKey());
    if (!key.startsWith(keyPrefix)) {
      break;
    }
    iter.next();
    String suffix = key.substring(keyPrefix.length()-1); // start with '/'
    if (suffix.equals(CONTAINER_REQUEST_KEY_SUFFIX)) {
      rcs.startRequest = new StartContainerRequestPBImpl(
          StartContainerRequestProto.parseFrom(entry.getValue()));
    } else if (suffix.equals(CONTAINER_VERSION_KEY_SUFFIX)) {
      rcs.version = Integer.parseInt(asString(entry.getValue()));
    } else if (suffix.equals(CONTAINER_START_TIME_KEY_SUFFIX)) {
      rcs.setStartTime(Long.parseLong(asString(entry.getValue())));
    } else if (suffix.equals(CONTAINER_DIAGS_KEY_SUFFIX)) {
      rcs.diagnostics = asString(entry.getValue());
    } else if (suffix.equals(CONTAINER_QUEUED_KEY_SUFFIX)) {
      if (rcs.status == RecoveredContainerStatus.REQUESTED) {
        rcs.status = RecoveredContainerStatus.QUEUED;
      }
    } else if (suffix.equals(CONTAINER_PAUSED_KEY_SUFFIX)) {
      if ((rcs.status == RecoveredContainerStatus.LAUNCHED)
          || (rcs.status == RecoveredContainerStatus.QUEUED)
          || (rcs.status == RecoveredContainerStatus.REQUESTED)) {
        rcs.status = RecoveredContainerStatus.PAUSED;
      }
    } else if (suffix.equals(CONTAINER_LAUNCHED_KEY_SUFFIX)) {
      if ((rcs.status == RecoveredContainerStatus.REQUESTED)
          || (rcs.status == RecoveredContainerStatus.QUEUED)
          || (rcs.status == RecoveredContainerStatus.PAUSED)) {
        rcs.status = RecoveredContainerStatus.LAUNCHED;
      }
    } else if (suffix.equals(CONTAINER_KILLED_KEY_SUFFIX)) {
      rcs.killed = true;
    } else if (suffix.equals(CONTAINER_EXIT_CODE_KEY_SUFFIX)) {
      rcs.status = RecoveredContainerStatus.COMPLETED;
      rcs.exitCode = Integer.parseInt(asString(entry.getValue()));
    } else if (suffix.equals(CONTAINER_UPDATE_TOKEN_SUFFIX)) {
      ContainerTokenIdentifierProto tokenIdentifierProto =
          ContainerTokenIdentifierProto.parseFrom(entry.getValue());
      Token currentToken = rcs.getStartRequest().getContainerToken();
      Token updatedToken = Token
          .newInstance(tokenIdentifierProto.toByteArray(),
              ContainerTokenIdentifier.KIND.toString(),
              currentToken.getPassword().array(), currentToken.getService());
      rcs.startRequest.setContainerToken(updatedToken);
      rcs.capability = new ResourcePBImpl(tokenIdentifierProto.getResource());
      rcs.version = tokenIdentifierProto.getVersion();
    } else if (suffix.equals(CONTAINER_REMAIN_RETRIES_KEY_SUFFIX)) {
      rcs.setRemainingRetryAttempts(
          Integer.parseInt(asString(entry.getValue())));
    } else if (suffix.equals(CONTAINER_WORK_DIR_KEY_SUFFIX)) {
      rcs.setWorkDir(asString(entry.getValue()));
    } else if (suffix.equals(CONTAINER_LOG_DIR_KEY_SUFFIX)) {
      rcs.setLogDir(asString(entry.getValue()));
    } else {
      LOG.warn("the container " + containerId
          + " will be killed because of the unknown key " + key
          + " during recovery.");
      containerUnknownKeySuffixes.put(containerId, suffix);
      rcs.setRecoveryType(RecoveredContainerType.KILL);
    }
  }
  return rcs;
}
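The status part of loadContainerState is a small state machine driven by which marker keys exist. The sketch below isolates just that logic, under the assumption that the markers are visited in whatever order the sorted keys produce them; RecoveredStatusSketch, Status and Marker are hypothetical names standing in for RecoveredContainerStatus and the key suffixes.

```java
import java.util.List;

/** Isolated sketch of the status transitions in loadContainerState; names are hypothetical. */
class RecoveredStatusSketch {
    enum Status { REQUESTED, QUEUED, PAUSED, LAUNCHED, COMPLETED }

    /** One value per status-related key suffix. */
    enum Marker { QUEUED, PAUSED, LAUNCHED, EXIT_CODE }

    static Status derive(List<Marker> markers) {
        Status status = Status.REQUESTED; // default when only the request is stored
        for (Marker m : markers) {
            switch (m) {
                case QUEUED:
                    if (status == Status.REQUESTED) {
                        status = Status.QUEUED;
                    }
                    break;
                case PAUSED:
                    if (status == Status.LAUNCHED || status == Status.QUEUED
                            || status == Status.REQUESTED) {
                        status = Status.PAUSED;
                    }
                    break;
                case LAUNCHED:
                    if (status == Status.REQUESTED || status == Status.QUEUED
                            || status == Status.PAUSED) {
                        status = Status.LAUNCHED;
                    }
                    break;
                case EXIT_CODE:
                    status = Status.COMPLETED; // an exit code always wins
                    break;
            }
        }
        return status;
    }
}
```

Once an exit code has been seen, no later marker can move the status away from COMPLETED, because none of the other branches accepts COMPLETED as a starting state.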
Recovering the container
private void recoverContainer(RecoveredContainerState rcs)
    throws IOException {
  StartContainerRequest req = rcs.getStartRequest();
  ContainerLaunchContext launchContext = req.getContainerLaunchContext();
  ContainerTokenIdentifier token = null;
  if(rcs.getCapability() != null) {
    ContainerTokenIdentifier originalToken =
        BuilderUtils.newContainerTokenIdentifier(req.getContainerToken());
    token = new ContainerTokenIdentifier(originalToken.getContainerID(),
        originalToken.getVersion(), originalToken.getNmHostAddress(),
        originalToken.getApplicationSubmitter(), rcs.getCapability(),
        originalToken.getExpiryTimeStamp(), originalToken.getMasterKeyId(),
        originalToken.getRMIdentifier(), originalToken.getPriority(),
        originalToken.getCreationTime(),
        originalToken.getLogAggregationContext(),
        originalToken.getNodeLabelExpression(),
        originalToken.getContainerType(), originalToken.getExecutionType());
  } else {
    token = BuilderUtils.newContainerTokenIdentifier(req.getContainerToken());
  }
  ContainerId containerId = token.getContainerID();
  ApplicationId appId =
      containerId.getApplicationAttemptId().getApplicationId();
  LOG.info("Recovering " + containerId + " in state " + rcs.getStatus()
      + " with exit code " + rcs.getExitCode());
  Application app = context.getApplications().get(appId);
  // If the application still exists
  if (app != null) {
    // Recover the container
    recoverActiveContainer(launchContext, token, rcs);
    // If the recovery type is KILL, kill the container
    if (rcs.getRecoveryType() == RecoveredContainerType.KILL) {
      dispatcher.getEventHandler().handle(
          new ContainerKillEvent(containerId, ContainerExitStatus.ABORTED,
              "Due to invalid StateStore info container was killed"
              + " during recovery"));
    }
  } else {
    // There is no corresponding application
    if (rcs.getStatus() != RecoveredContainerStatus.COMPLETED) {
      LOG.warn(containerId + " has no corresponding application!");
    }
    // The container is recorded as recently stopped so that it gets closed out
    LOG.info("Adding " + containerId + " to recently stopped containers");
    nodeStatusUpdater.addCompletedContainer(containerId);
  }
}
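Stripped of the token handling, recoverContainer makes a three-way decision. The following sketch captures only that branching; RecoverDecisionSketch, Action and decide are hypothetical names, and the booleans stand in for "the application is still in the NM context" and "the recovery type is KILL".

```java
/** Sketch of the branching in recoverContainer; names are hypothetical. */
class RecoverDecisionSketch {
    enum Action { RECOVER, RECOVER_THEN_KILL, REPORT_COMPLETED }

    static Action decide(boolean appKnown, boolean killRequested) {
        if (!appKnown) {
            // No application to attach to: the container is only reported as stopped.
            return Action.REPORT_COMPLETED;
        }
        // The container is always re-attached first; a KILL recovery type
        // then triggers a kill event for it.
        return killRequested ? Action.RECOVER_THEN_KILL : Action.RECOVER;
    }
}
```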
Shutdown
When the NM shuts down
@Override
protected void serviceStop() throws Exception {
  if (isStopping.getAndSet(true)) {
    return;
  }
  try {
    super.serviceStop();
    DefaultMetricsSystem.shutdown();
  } finally {
    // YARN-3641: NM's services stop get failed shouldn't block the
    // release of NMLevelDBStore.
    stopRecoveryStore();
  }
}
stopRecoveryStore() calls nmStore.stop(); the stop logic in NMStateStoreService in turn calls closeStorage(), which NMLeveldbStateStoreService implements as follows
protected void closeStorage() throws IOException {
  // Cancel the compaction timer
  if (compactionTimer != null) {
    compactionTimer.cancel();
    compactionTimer = null;
  }
  // Close the database
  if (db != null) {
    db.close();
  }
}