3. HDFS Source Code Analysis: The DataNode BlockScanner Implementation


Last modified: 2025-05-05 14:55:19


License: CC 4.0 BY-SA

Tags: #hdfs #hadoop #bigdata

First published: 2020-06-22 10:24:07

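This article takes a close look at the BlockScanner mechanism in the Hadoop DataNode: how BlockScanner uses VolumeScanner to periodically verify every stored block, how suspect blocks are handled, and how corrupt blocks are reported to the NameNode to preserve data integrity and reliability.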

BlockScanner Implementation

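Every DataNode runs a BlockScanner that periodically verifies the correctness of all blocks stored on that DataNode and reports corrupt blocks to the NameNode.
VolumeScanner is the service that scans the blocks of a single storage directory. Because a DataNode can be configured with multiple directories, BlockScanner holds multiple VolumeScanner instances.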

public class BlockScanner {
  ...
  // Holds one VolumeScanner per volume, keyed by storage ID
  private final TreeMap<String, VolumeScanner> scanners =
      new TreeMap<String, VolumeScanner>();
  ...
  // Adds a VolumeScanner for the given volume
  public synchronized void addVolumeScanner(FsVolumeReference ref) {
    boolean success = false;
    try {
      FsVolumeSpi volume = ref.getVolume();
      if (!isEnabled()) {
        LOG.debug("Not adding volume scanner for {}, because the block " +
            "scanner is disabled.", volume.getBasePath());
        return;
      }
      // Look up an existing VolumeScanner in the map
      VolumeScanner scanner = scanners.get(volume.getStorageID());
      if (scanner != null) {
        LOG.error("Already have a scanner for volume {}.",
            volume.getBasePath());
        return;
      }
      LOG.debug("Adding scanner for volume {} (StorageID {})",
          volume.getBasePath(), volume.getStorageID());
      // Create the VolumeScanner
      scanner = new VolumeScanner(conf, datanode, ref);
      // Start the VolumeScanner thread
      scanner.start();
      // Register the VolumeScanner in the map
      scanners.put(volume.getStorageID(), scanner);
      success = true;
    } finally {
      if (!success) {
        // If we didn't create a new VolumeScanner object, we don't
        // need this reference to the volume.
        IOUtils.cleanup(null, ref);
      }
    }
  }

}
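
The `success` flag combined with `finally` in `addVolumeScanner` guarantees that the volume reference is released on every path that does not hand it to a new scanner. The following is a minimal, self-contained sketch of that pattern, not Hadoop's code: `VolumeRef` and `ScannerStub` are hypothetical stand-ins for `FsVolumeReference` and `VolumeScanner`, and the boolean return value is added only to make the outcome easy to observe.

```java
import java.util.Map;
import java.util.TreeMap;

public class ScannerRegistrySketch {

  /** Hypothetical stand-in for FsVolumeReference: a closeable handle on a volume. */
  static final class VolumeRef implements AutoCloseable {
    final String storageId;
    boolean closed = false;

    VolumeRef(String storageId) {
      this.storageId = storageId;
    }

    @Override
    public void close() {
      closed = true;
    }
  }

  /** Hypothetical stand-in for VolumeScanner; it keeps the reference it is given. */
  static final class ScannerStub {
    final VolumeRef ref;

    ScannerStub(VolumeRef ref) {
      this.ref = ref;
    }
  }

  // One scanner per volume, keyed by storage ID.
  private final Map<String, ScannerStub> scanners = new TreeMap<>();
  private final boolean enabled;

  ScannerRegistrySketch(boolean enabled) {
    this.enabled = enabled;
  }

  /** Returns true only if a new scanner was registered for this volume. */
  public synchronized boolean addVolumeScanner(VolumeRef ref) {
    boolean success = false;
    try {
      if (!enabled || scanners.containsKey(ref.storageId)) {
        return false; // early exit: the finally block releases ref
      }
      scanners.put(ref.storageId, new ScannerStub(ref));
      success = true;
      return true;
    } finally {
      if (!success) {
        ref.close(); // nobody took ownership, so release the handle
      }
    }
  }

  public synchronized int scannerCount() {
    return scanners.size();
  }

  public static void main(String[] args) {
    ScannerRegistrySketch registry = new ScannerRegistrySketch(true);
    VolumeRef first = new VolumeRef("DS-1");
    VolumeRef duplicate = new VolumeRef("DS-1");
    System.out.println("first added: " + registry.addVolumeScanner(first));
    System.out.println("duplicate added: " + registry.addVolumeScanner(duplicate));
    System.out.println("duplicate released: " + duplicate.closed);
  }
}
```

In this sketch a second registration for the same storage ID is rejected and its reference is closed, while the reference owned by the registered scanner stays open.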

VolumeScanner Implementation

public class VolumeScanner extends Thread {
...
  // Suspect blocks; the scanner checks these first
  private final LinkedHashSet<ExtendedBlock> suspectBlocks =
      new LinkedHashSet<ExtendedBlock>();
...

  public void run() {
    // Record the minute on which the scanner started.
    this.startMinute =
        TimeUnit.MINUTES.convert(Time.monotonicNow(), TimeUnit.MILLISECONDS);
    this.curMinute = startMinute;
    try {
      LOG.trace("{}: thread starting.", this);
      resultHandler.setup(this);
      try {
        long timeout

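
The comment on `suspectBlocks` says the scanner checks suspect blocks before anything else. A `LinkedHashSet` suits this because it drops duplicate reports while preserving the order in which blocks were first reported. The sketch below illustrates that idea only, under stated assumptions: block IDs are plain strings, `regularBlocks` is a hypothetical stand-in for the scanner's normal iteration over a volume, and `markSuspect` and `nextBlock` are illustrative names rather than Hadoop methods.

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Queue;

public class SuspectFirstSketch {

  // Blocks reported as suspect: insertion-ordered and duplicate-free.
  private final LinkedHashSet<String> suspectBlocks = new LinkedHashSet<>();

  // Hypothetical stand-in for the regular pass over all blocks on a volume.
  private final Queue<String> regularBlocks;

  SuspectFirstSketch(List<String> blocks) {
    this.regularBlocks = new ArrayDeque<>(blocks);
  }

  /** Records a block as suspect; reporting the same block twice has no extra effect. */
  public synchronized void markSuspect(String blockId) {
    suspectBlocks.add(blockId);
  }

  /** Returns the next block to scan, or null when nothing is left. */
  public synchronized String nextBlock() {
    Iterator<String> it = suspectBlocks.iterator();
    if (it.hasNext()) {
      String block = it.next();
      it.remove(); // each suspect report is consumed once
      return block;
    }
    return regularBlocks.poll();
  }

  public static void main(String[] args) {
    SuspectFirstSketch scanner =
        new SuspectFirstSketch(List.of("blk_1", "blk_2"));
    scanner.markSuspect("blk_9");
    scanner.markSuspect("blk_9"); // duplicate report, ignored by the set
    String next;
    while ((next = scanner.nextBlock()) != null) {
      System.out.println("scanning " + next);
    }
  }
}
```

Suspect blocks are drained in first-reported order before the regular pass resumes, so a block that a reader flagged does not have to wait for a full scan cycle.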