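Preface
When DFSClient writes a file, it builds a pipeline for each block of the file and streams the data along that pipeline. The DFSClient may, however, be interrupted during the transfer, for example by a network failure. In that case the block remains in the UnderConstruction state on the NameNode, and the write lock on the file containing the block (implemented through LeaseManager) is still held by that DFSClient and cannot be released. For this reason, LeaseManager runs recovery on files held by a DFSClient that has been inactive for longer than a certain period.
The BlockInfoContiguousUnderConstruction class also has a member variable, replicas, that stores the "expected DataNodes which make up of the pipeline for that Block". This member is populated after the pipeline has been created; in other words, it does not guide pipeline construction, and its purpose is recovery. Note that BlockInfoContiguous, the base class of BlockInfoContiguousUnderConstruction, has a variable triplets that stores the DataNodeStorageInfo of every replica of the block. replicas records the "expected DataNodes", whereas triplets records the "stored DataNodes". The rest of this article walks through the recovery process.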
lease recovery
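An HDFS lease ensures that a file can be written by only one client at any given time. Before writing or appending to a file, a client must obtain the lease for that file from the NameNode. While the client is writing data, a dedicated thread in the client keeps renewing the lease, extending its period of exclusive write access.
A lease has two limits: a soft limit, 60 s by default, and a hard limit, 1 hour by default. Until the soft limit expires, the client has exclusive access to the file, and no other client can take away its right to write it. Once the soft limit has expired, any client can reclaim the lease, obtain the lease on the file, and so gain exclusive access to it. Once the hard limit has expired, the NameNode forcibly closes the file and revokes the lease.
Consider a client that crashes while writing a file. Until the lease soft limit expires, no other client can write the file. After the soft limit expires, other clients can write it. Before such a write, a check is made to see whether the file is still unclosed. If it is, the file goes through lease recovery and block recovery, whose goal is to bring all replicas of the file's last block to a consistent state. Inconsistency is expected here: the client writes the replicas of a block through a pipeline, so the replicas along the pipeline can easily differ.
Simulated scenario: the client process dies while writing a file, and a new client is then started to append to the same file.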
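The two limits can be illustrated with a small model. This is a minimal sketch, not the HDFS Lease class: the class name LeaseSketch, its methods, and the explicit nowMs parameter are all assumptions made for illustration, and only the two default values come from the text above.

```java
/** Simplified model of a lease's two timeouts (hypothetical; not the real HDFS Lease class). */
class LeaseSketch {
    // Defaults quoted in the text: soft limit 60 s, hard limit 1 hour.
    static final long SOFT_LIMIT_MS = 60_000L;
    static final long HARD_LIMIT_MS = 60L * 60_000L;

    private final String holder;
    private long lastRenewedMs;

    LeaseSketch(String holder, long nowMs) {
        this.holder = holder;
        this.lastRenewedMs = nowMs;
    }

    /** What the writer's renewal thread would do periodically. */
    void renew(long nowMs) {
        lastRenewedMs = nowMs;
    }

    boolean expiredSoftLimit(long nowMs) {
        return nowMs - lastRenewedMs > SOFT_LIMIT_MS;
    }

    boolean expiredHardLimit(long nowMs) {
        return nowMs - lastRenewedMs > HARD_LIMIT_MS;
    }

    /** Another client may take the lease over only after the soft limit has passed. */
    boolean canBeRecoveredBy(String client, long nowMs) {
        return !holder.equals(client) && expiredSoftLimit(nowMs);
    }
}
```

Passing the current time in explicitly keeps the model deterministic; a real implementation would read a clock instead.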
FileSystem fs = FileSystem.get(configuration);
FSDataOutputStream out = fs.append(path);
out.write(data); // data: the byte[] to append
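In this scenario the new client's first append attempt may be rejected while recovery is still running (the NameNode code quoted later in this article reports "lease recovery is in progress. Try again later."). One way a caller might cope is to retry with a pause. The following is a minimal sketch under stated assumptions: AppendRetrySketch, IoAction, and the stand-in RecoveryInProgress exception are hypothetical names that let the example run without Hadoop on the classpath, and they are not part of the HDFS API.

```java
/** Hypothetical retry helper; all names here are illustrative, not Hadoop classes. */
class AppendRetrySketch {
    /** Stand-in for the "recovery in progress" failure, so the sketch needs no Hadoop jars. */
    static class RecoveryInProgress extends java.io.IOException {
        RecoveryInProgress(String message) {
            super(message);
        }
    }

    /** An operation that may fail with an IOException, e.g. opening a file for append. */
    interface IoAction<T> {
        T run() throws java.io.IOException;
    }

    /** Runs the action, retrying only while it reports that recovery is still in progress. */
    static <T> T retryWhileRecovering(IoAction<T> action, int maxAttempts, long pauseMillis)
            throws java.io.IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RecoveryInProgress last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.run();
            } catch (RecoveryInProgress e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMillis); // give recovery time to finish
                }
            }
        }
        throw last; // every attempt hit "recovery in progress"
    }
}
```

Any other IOException is propagated immediately, since retrying would not help. How the failure actually surfaces on a real client depends on the Hadoop version and is not covered by this sketch.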
On the NameNode side, the main logic of the append operation is handled in the appendFileInternal function of FSNamesystem, which internally calls
// Opening an existing file for append - may need to recover lease.
recoverLeaseInternal(RecoverLeaseOp.APPEND_FILE,
iip, src, holder, clientMachine, false);
The recoverLeaseInternal method mainly checks whether lease recovery must first be performed on the file:
boolean recoverLeaseInternal(RecoverLeaseOp op, INodesInPath iip,
    String src, String holder, String clientMachine, boolean force)
    throws IOException {
  assert hasWriteLock();
  INodeFile file = iip.getLastINode().asFile();
  if (file.isUnderConstruction()) {
    //
    // If the file is under construction , then it must be in our
    // leases. Find the appropriate lease record.
    //
    Lease lease = leaseManager.getLease(holder);
    if (!force && lease != null) {
      Lease leaseFile = leaseManager.getLeaseByPath(src);
      if (leaseFile != null && leaseFile.equals(lease)) {
        // We found the lease for this file but the original
        // holder is trying to obtain it again.
        throw new AlreadyBeingCreatedException(
            op.getExceptionMessage(src, holder, clientMachine,
                holder + " is already the current lease holder."));
      }
    }
    //
    // Find the original holder.
    //
    FileUnderConstructionFeature uc = file.getFileUnderConstructionFeature();
    String clientName = uc.getClientName();
    lease = leaseManager.getLease(clientName);
    if (lease == null) {
      throw new AlreadyBeingCreatedException(
          op.getExceptionMessage(src, holder, clientMachine,
              "the file is under construction but no leases found."));
    }
    if (force) {
      // close now: no need to wait for soft lease expiration and
      // close only the file src
      LOG.info("recoverLease: " + lease + ", src=" + src +
          " from client " + clientName);
      return internalReleaseLease(lease, src, iip, holder);
    } else {
      assert lease.getHolder().equals(clientName) :
          "Current lease holder " + lease.getHolder() +
          " does not match file creator " + clientName;
      //
      // If the original holder has not renewed in the last SOFTLIMIT
      // period, then start lease recovery.
      //
      if (lease.expiredSoftLimit()) {
        LOG.info("startFile: recover " + lease + ", src=" + src + " client "
            + clientName);
        if (internalReleaseLease(lease, src, iip, null)) {
          return true;
        } else {
          throw new RecoveryInProgressException(
              op.getExceptionMessage(src, holder, clientMachine,
                  "lease recovery is in progress. Try again later."));
        }
      } else {
        final BlockInfoContiguous lastBlock = file.getLastBlock();
        if (lastBlock != null
            && lastBlock.getBlockUCState() == BlockUCState.UNDER_RECOVERY) {
          throw new RecoveryInProgressException(
              op.getExceptionMessage(src, holder, clientMachine,
                  "another recovery is in progress by "
                      + clientName + " on " + uc.getClientMachine()));
        } else {
          throw new AlreadyBeingCreatedException(
              op.getExceptionMessage(src, holder, clientMachine,
                  "this file lease is currently owned by "
                      + clientName + " on " + uc
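The branches visible in the excerpt above can be condensed into a decision table. This is a simplified sketch for reading the code, not part of FSNamesystem: the class RecoverLeaseDecisionSketch, the Outcome enum, and the boolean parameters are hypothetical, and the outcome for a file that is not under construction is an assumption, because the excerpt is cut off before that path.

```java
/** Hypothetical summary of the branches shown in the recoverLeaseInternal excerpt. */
class RecoverLeaseDecisionSketch {
    enum Outcome {
        NOT_UNDER_CONSTRUCTION, // assumption: nothing to recover (path not shown in the excerpt)
        ALREADY_BEING_CREATED,  // the excerpt throws AlreadyBeingCreatedException
        RELEASE_NOW,            // force == true: internalReleaseLease is called right away
        START_RECOVERY,         // soft limit expired: internalReleaseLease is called
        RECOVERY_IN_PROGRESS    // the excerpt throws RecoveryInProgressException
    }

    static Outcome decide(boolean underConstruction, boolean force,
                          boolean callerHoldsLeaseOnFile, boolean originalLeaseFound,
                          boolean softLimitExpired, boolean lastBlockUnderRecovery) {
        if (!underConstruction) {
            return Outcome.NOT_UNDER_CONSTRUCTION;
        }
        if (!force && callerHoldsLeaseOnFile) {
            return Outcome.ALREADY_BEING_CREATED; // holder is already the current lease holder
        }
        if (!originalLeaseFound) {
            return Outcome.ALREADY_BEING_CREATED; // under construction but no lease found
        }
        if (force) {
            return Outcome.RELEASE_NOW;
        }
        if (softLimitExpired) {
            // In the excerpt, a false result from internalReleaseLease also ends in
            // RecoveryInProgressException; that detail is collapsed here.
            return Outcome.START_RECOVERY;
        }
        if (lastBlockUnderRecovery) {
            return Outcome.RECOVERY_IN_PROGRESS;
        }
        return Outcome.ALREADY_BEING_CREATED; // lease still owned by the original client
    }
}
```

For the scenario in this article (a different client appends after the original writer died and its soft limit has passed), the sketch yields START_RECOVERY.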

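This article analyzes the Lease Recovery and Block Recovery mechanisms in HDFS and explains how HDFS ensures data consistency when a client's write is interrupted, in particular how it handles lease timeouts and inconsistent block replicas.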



