HBase Source Code Study ---- Flush (3)

Latest recommended article published on 2022-07-07 11:11:13

weixin_46149099

Column: Understanding the HBase Source Code. Tags: Big Data

Article link: https://blog.youkuaiyun.com/weixin_46149099/article/details/113782315

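This post takes a closer look at HBase's StoreFlusher, focusing on the flush operation and its two implementations, DefaultStoreFlusher and StripeStoreFlusher. It walks through how flushCache(), createScanner(), and performFlush() work, and mentions that stripe compaction is an experimental feature.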

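The previous post outlined the overall flush flow, which has three phases: prepare, flush, and commit. The most time-consuming phase, and the one that involves disk I/O, is the flush phase, which writes the memstore snapshot to temporary files. This post digs into that phase.
Store writes the snapshot to temporary files by calling flushCache(), which in turn calls flushSnapshot() on the StoreFlusher obtained from the storeEngine: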

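    // Excerpt from flushCache(); the rest of the retry loop is not shown here.
    StoreFlusher flusher = storeEngine.getStoreFlusher();
    IOException lastException = null;
    // Try the flush up to flushRetriesNumber times.
    for (int i = 0; i < flushRetriesNumber; i++) {
      try {
        // Write the snapshot out; returns the paths of the temporary files.
        List<Path> pathNames =
            flusher.flushSnapshot(snapshot, logCacheFlushId, status, throughputController);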
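The excerpt above stops inside the try block, so the way a failed attempt is handled is not visible. As a rough illustration only, the sketch below shows one plausible shape for such a retry loop: remember the last IOException, try again, and rethrow it once the attempts are used up. SnapshotFlusher, flushWithRetries, and the String paths are hypothetical stand-ins, not HBase types, and the catch-and-retry behaviour is an assumption based on the lastException variable.

```java
import java.io.IOException;
import java.util.Collections;
import java.util.List;

final class FlushRetrySketch {
  /** Hypothetical stand-in for StoreFlusher.flushSnapshot(); not the HBase type. */
  interface SnapshotFlusher {
    List<String> flushSnapshot() throws IOException;
  }

  /** Assumed retry behaviour: keep the last failure and rethrow it when retries run out. */
  static List<String> flushWithRetries(SnapshotFlusher flusher, int retries)
      throws IOException {
    IOException lastException = null;
    for (int i = 0; i < retries; i++) {
      try {
        // First successful attempt wins and ends the loop.
        return flusher.flushSnapshot();
      } catch (IOException e) {
        lastException = e; // remember the failure and try again
      }
    }
    // Reached only if every attempt failed, or if retries <= 0.
    throw lastException != null ? lastException : new IOException("no flush attempt was made");
  }

  public static void main(String[] args) throws IOException {
    int[] attempts = {0};
    // Simulated flusher that fails twice before succeeding.
    List<String> paths = flushWithRetries(() -> {
      if (attempts[0]++ < 2) {
        throw new IOException("simulated failure");
      }
      return Collections.singletonList("tmp/file-1");
    }, 5);
    System.out.println(paths + " after " + attempts[0] + " attempts");
  }
}
```

With the simulated flusher in main, the third attempt succeeds, so the loop returns before reaching the fifth retry.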

StoreFlusher is an abstract class. HBase currently has two implementations: DefaultStoreFlusher and StripeStoreFlusher.

StoreFlusher

StoreFlusher has three non-abstract methods, finalizeWriter(), createScanner(), and performFlush(), and one abstract method, flushSnapshot().
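This split (shared helpers in the base class, one abstract entry point per implementation) can be pictured with a stripped-down analogue. The sketch below is not HBase code: cells are plain Strings, a "file" is a List, and what the stand-in finalizeWriter() does (appending a sequence-id marker) is an assumption made purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class StoreFlusherShapeSketch {
  /** Hypothetical, simplified analogue of StoreFlusher. */
  abstract static class SketchFlusher {
    // Shared helper, playing the role of createScanner().
    protected Iterator<String> createScanner(List<String> snapshot) {
      return snapshot.iterator();
    }

    // Shared helper, playing the role of performFlush(): drain the scanner into the sink.
    protected void performFlush(Iterator<String> scanner, List<String> sink) {
      while (scanner.hasNext()) {
        sink.add(scanner.next());
      }
    }

    // Shared helper, playing the role of finalizeWriter() (assumed behaviour).
    protected void finalizeWriter(List<String> sink, long cacheFlushSeqNum) {
      sink.add("meta:seq=" + cacheFlushSeqNum);
    }

    // The one method every concrete flusher has to provide.
    public abstract List<List<String>> flushSnapshot(List<String> snapshot, long seqNum);
  }

  /** Writes the whole snapshot into a single output "file". */
  static final class SingleFileFlusher extends SketchFlusher {
    @Override
    public List<List<String>> flushSnapshot(List<String> snapshot, long seqNum) {
      List<String> file = new ArrayList<>();
      performFlush(createScanner(snapshot), file);
      finalizeWriter(file, seqNum);
      List<List<String>> result = new ArrayList<>();
      result.add(file);
      return result;
    }
  }

  public static void main(String[] args) {
    SketchFlusher flusher = new SingleFileFlusher();
    System.out.println(flusher.flushSnapshot(Arrays.asList("row1", "row2"), 42L));
  }
}
```

A second subclass could reuse the same three helpers while producing several output files, which is the kind of variation an abstract flushSnapshot() leaves room for.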

The createScanner() method

  protected InternalScanner createScanner(KeyValueScanner snapshotScanner,
      long smallestReadPoint) throws IOException {
    InternalScanner scanner = null;
    // Give a coprocessor the chance to supply its own scanner.
    if (store.getCoprocessorHost() != null) {
      scanner = store.getCoprocessorHost().preFlushScannerOpen(store, snapshotScanner,
          smallestReadPoint);
    }
    // No coprocessor scanner: build the default StoreScanner over the snapshot.
    if (scanner == null) {
      Scan scan = new Scan();
      scan.setMaxVersions(store.getScanInfo().getMaxVersions());
      // COMPACT_RETAIN_DELETES: delete markers are kept in the scan output.
      scanner = new StoreScanner(store, store.getScanInfo(), scan,
          Collections.singletonList(snapshotScanner), ScanType.COMPACT_RETAIN_DELETES,
          smallestReadPoint, HConstants.OLDEST_TIMESTAMP);
    }
    assert scanner != null;
    // Let the preFlush hook wrap or replace the scanner; close it if the hook fails.
    if (store.getCoprocessorHost() != null) {
      try {
        return store.getCoprocessorHost().preFlush(store, scanner);
      } catch (IOException ioe) {
        scanner.close();
        throw ioe;
      }
    }
    return scanner;
  }

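This method creates a scanner over the snapshot. Because it reads only from memory, the scan is relatively fast. If a coprocessor host is present, its preFlushScannerOpen hook can supply a replacement scanner, and its preFlush hook is then run on whichever scanner was chosen. If preFlush throws, the scanner is closed before the exception propagates.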
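The control flow of createScanner() (optional override, default fallback, then a wrapping hook with cleanup on failure) can be isolated from the HBase types. The following is a minimal sketch under that assumption; FlushScanner, Hooks, and TrackingScanner are hypothetical names introduced here, and only the ordering of the steps mirrors the method above.

```java
import java.io.Closeable;
import java.io.IOException;

final class CreateScannerSketch {
  /** Hypothetical stand-in for InternalScanner. */
  interface FlushScanner extends Closeable {}

  /** Hypothetical stand-in for the two coprocessor hooks used by createScanner(). */
  interface Hooks {
    FlushScanner preFlushScannerOpen() throws IOException; // may return null
    FlushScanner preFlush(FlushScanner scanner) throws IOException;
  }

  /** Scanner that records whether close() was called. */
  static final class TrackingScanner implements FlushScanner {
    boolean closed;
    @Override
    public void close() {
      closed = true;
    }
  }

  static FlushScanner createScanner(Hooks hooks, FlushScanner defaultScanner)
      throws IOException {
    FlushScanner scanner = null;
    if (hooks != null) {
      scanner = hooks.preFlushScannerOpen(); // hook may override the scanner
    }
    if (scanner == null) {
      scanner = defaultScanner; // fall back to the default
    }
    if (hooks != null) {
      try {
        return hooks.preFlush(scanner); // hook may wrap or replace it
      } catch (IOException ioe) {
        scanner.close(); // do not leak the scanner if the hook fails
        throw ioe;
      }
    }
    return scanner;
  }

  public static void main(String[] args) {
    TrackingScanner fallback = new TrackingScanner();
    Hooks failing = new Hooks() {
      @Override
      public FlushScanner preFlushScannerOpen() {
        return null;
      }
      @Override
      public FlushScanner preFlush(FlushScanner scanner) throws IOException {
        throw new IOException("hook failed");
      }
    };
    try {
      createScanner(failing, fallback);
    } catch (IOException e) {
      System.out.println("closed after hook failure: " + fallback.closed);
    }
  }
}
```

The point of the try/catch is ownership: until preFlush() returns, the method still owns the scanner it created, so it has to close it on the failure path.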

The performFlush() method

  protected void performFlush(InternalScanner scanner, Compactor.CellSink sink,
      long smallestReadPoint, ThroughputController throughputController) throws IOException {
    // Upper bound on the number of cells returned by one scanner.next() call.
    int compactionKVMax =
        conf.getInt(HConstants.COMPACTION_KV_MAX, HConstants.COMPACTION_KV_MAX_DEFAULT);

    ScannerContext scannerContext =
        ScannerContext.newBuilder().setBatchLimit(compactionKVMax).build();

    List<Cell> kvs = new ArrayList<Cell>();
    boolean hasMore;
    String flushName = ThroughputControlUtil.getNameForThrottling(store, "flush");
    // Throttle only when a controller was supplied and this is not a system table.
    boolean control = throughputController != null && !store.getRegionInfo().isSystemTable();
    if (control) {
      throughputController.start(flushName);
    }
    try {
      do {
        // Fetch the next batch of cells from the snapshot scanner.
        hasMore = scanner.next(kvs, scannerContext);
        if (!kvs.isEmpty()) {
          for (Cell c : kvs) {
            // Append each cell to the sink and report its length to the throttle.
            sink.append(c);
            int len = KeyValueUtil.length(c);
            if (control) {
              throughputController.control(flushName, len);
            }
          }
          kvs.clear();
        }
      } while (hasMore);
    } catch (InterruptedException e) {
      throw new InterruptedIOException("Interrupted while control throughput of flushing "
          + flushName);
    } finally {
      // Always deregister the flush from the throughput controller.
      if (control) {
        throughputController.finish(flushName);
      }
    }
  }
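The core of performFlush() is a batch-drain loop: ask the scanner for up to a fixed number of cells, append them to the sink, clear the buffer, and repeat while the scanner reports more data. The do/while form matters because the last call can return cells and still report that nothing remains. The sketch below reproduces only that loop; BatchScanner and the String cells are hypothetical, and the returned total stands in for the lengths that the real method passes to the throughput controller.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class PerformFlushSketch {
  /** Hypothetical batch scanner: adds up to 'limit' cells to 'out'. */
  static final class BatchScanner {
    private final Iterator<String> cells;

    BatchScanner(List<String> cells) {
      this.cells = cells.iterator();
    }

    /** Returns true if more cells remain after this batch. */
    boolean next(List<String> out, int limit) {
      while (out.size() < limit && cells.hasNext()) {
        out.add(cells.next());
      }
      return cells.hasNext();
    }
  }

  /** Drains the scanner into the sink and returns the summed cell lengths. */
  static long performFlush(BatchScanner scanner, List<String> sink, int batchLimit) {
    List<String> batch = new ArrayList<>();
    long totalLength = 0;
    boolean hasMore;
    do {
      hasMore = scanner.next(batch, batchLimit);
      for (String cell : batch) {
        sink.add(cell);
        totalLength += cell.length(); // the real code reports this to the throttle
      }
      batch.clear(); // reuse the same buffer for the next batch
    } while (hasMore);
    return totalLength;
  }

  public static void main(String[] args) {
    List<String> sink = new ArrayList<>();
    BatchScanner scanner =
        new BatchScanner(Arrays.asList("a", "bb", "ccc", "dddd", "eeeee"));
    long total = performFlush(scanner, sink, 2);
    System.out.println(sink.size() + " cells, total length " + total); // 5 cells, total length 15
  }
}
```

Reusing one buffer and clearing it after each batch keeps memory bounded by the batch limit rather than by the snapshot size, which is presumably why the original uses a batch limit on the ScannerContext.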