BlockScanner Implementation
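Every DataNode runs a BlockScanner that periodically verifies all data blocks stored on that DataNode and reports corrupt blocks to the NameNode.
VolumeScanner is the service that scans the blocks of a single storage directory (volume). Because a DataNode can be configured with multiple storage directories, a BlockScanner holds multiple VolumeScanner instances, keyed by storage ID.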
public class BlockScanner {
  ...
  // Holds the VolumeScanner instances, keyed by storage ID
  private final TreeMap<String, VolumeScanner> scanners =
      new TreeMap<String, VolumeScanner>();
  ...
  // Add a VolumeScanner for the given volume
  public synchronized void addVolumeScanner(FsVolumeReference ref) {
    boolean success = false;
    try {
      FsVolumeSpi volume = ref.getVolume();
      if (!isEnabled()) {
        LOG.debug("Not adding volume scanner for {}, because the block " +
            "scanner is disabled.", volume.getBasePath());
        return;
      }
      // Look up an existing VolumeScanner in the map
      VolumeScanner scanner = scanners.get(volume.getStorageID());
      if (scanner != null) {
        LOG.error("Already have a scanner for volume {}.",
            volume.getBasePath());
        return;
      }
      LOG.debug("Adding scanner for volume {} (StorageID {})",
          volume.getBasePath(), volume.getStorageID());
      // Create the VolumeScanner
      scanner = new VolumeScanner(conf, datanode, ref);
      // Start the VolumeScanner thread
      scanner.start();
      // Register the VolumeScanner in the map
      scanners.put(volume.getStorageID(), scanner);
      success = true;
    } finally {
      if (!success) {
        // If we didn't create a new VolumeScanner object, we don't
        // need this reference to the volume.
        IOUtils.cleanup(null, ref);
      }
    }
  }
}
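The registration logic above follows a common pattern: check a guard, refuse duplicates, and release the passed-in resource in a finally block whenever no new scanner ends up owning it. The following is a minimal, self-contained sketch of that pattern only. SketchBlockScanner, SketchVolumeScanner, and the use of AutoCloseable in place of FsVolumeReference are hypothetical stand-ins for illustration, not Hadoop classes.

```java
import java.util.TreeMap;

// Hypothetical stand-in for a per-volume scanner thread (not the Hadoop class).
class SketchVolumeScanner extends Thread {
    private final String storageId;

    SketchVolumeScanner(String storageId) {
        this.storageId = storageId;
        setDaemon(true); // do not keep the JVM alive for this sketch
    }

    @Override
    public void run() {
        // A real scanner would loop over blocks here; the sketch returns at once.
    }
}

// Hypothetical registry that mirrors the shape of addVolumeScanner above.
class SketchBlockScanner {
    private final TreeMap<String, SketchVolumeScanner> scanners = new TreeMap<>();
    private final boolean enabled;

    SketchBlockScanner(boolean enabled) {
        this.enabled = enabled;
    }

    // Returns true only if a new scanner was created and registered.
    // The reference is closed whenever no new scanner took ownership of it.
    synchronized boolean addVolumeScanner(String storageId, AutoCloseable ref) {
        boolean success = false;
        try {
            if (!enabled) {
                return false; // scanning disabled: nothing to register
            }
            if (scanners.containsKey(storageId)) {
                return false; // duplicate: keep the existing scanner
            }
            SketchVolumeScanner scanner = new SketchVolumeScanner(storageId);
            scanner.start();
            scanners.put(storageId, scanner);
            success = true;
            return true;
        } finally {
            if (!success) {
                try {
                    ref.close(); // release the unused reference
                } catch (Exception e) {
                    // The sketch ignores close failures.
                }
            }
        }
    }

    synchronized int scannerCount() {
        return scanners.size();
    }
}
```

In this sketch a second call with the same storage ID leaves the map unchanged and closes the reference that was passed in, which matches the intent of the "Already have a scanner" branch in the excerpt.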
VolumeScanner Implementation
public class VolumeScanner extends Thread {
  ...
  // Suspect blocks; the scanner scans these first
  private final LinkedHashSet<ExtendedBlock> suspectBlocks =
      new LinkedHashSet<ExtendedBlock>();
  ...
  public void run() {
    // Record the minute on which the scanner started.
    this.startMinute =
        TimeUnit.MINUTES.convert(Time.monotonicNow(), TimeUnit.MILLISECONDS);
    this.curMinute = startMinute;
    try {
      LOG.trace("{}: thread starting.", this);
      resultHandler.setup(this);
      try {
        long timeout

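This article examines the BlockScanner mechanism in the Hadoop DataNode: how BlockScanner uses VolumeScanner to periodically verify all stored data blocks, how suspect blocks are handled, and how corrupt blocks are reported to the NameNode to keep data intact and reliable.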
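The suspectBlocks field is declared as a LinkedHashSet, which keeps insertion order and ignores duplicate entries, so it can act as a de-duplicated first-in, first-out queue of blocks to check ahead of the regular scan. The sketch below illustrates that idea under stated assumptions: SuspectFirstPicker, markSuspect, and nextBlock are hypothetical names, block IDs are plain strings, and the regular scan order is modeled as a simple queue. It is not the Hadoop implementation.

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical picker that serves suspect blocks before regular ones.
class SuspectFirstPicker {
    // Insertion-ordered and duplicate-free, like the suspectBlocks field above.
    private final LinkedHashSet<String> suspectBlocks = new LinkedHashSet<>();
    // Assumed stand-in for the normal scan cursor over the volume.
    private final ArrayDeque<String> regularBlocks;

    SuspectFirstPicker(List<String> regular) {
        this.regularBlocks = new ArrayDeque<>(regular);
    }

    // Marking the same block twice keeps a single entry in its original position.
    synchronized void markSuspect(String blockId) {
        suspectBlocks.add(blockId);
    }

    // Returns the oldest suspect block if any, otherwise the next regular
    // block, or null when nothing is left to scan.
    synchronized String nextBlock() {
        Iterator<String> it = suspectBlocks.iterator();
        if (it.hasNext()) {
            String block = it.next();
            it.remove();
            return block;
        }
        return regularBlocks.poll();
    }
}
```

With regular blocks blk_1 and blk_2, and blk_9 marked suspect twice followed by blk_7, successive calls to nextBlock return blk_9, blk_7, blk_1, blk_2, and then null.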