The RegionServer flush process in HBase

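This article walks through the key memstore configuration parameters in HBase: the global memstore size of a RegionServer and the condition under which it triggers a flush, the size limit of the memstore within a single region, the automatic flush interval for in-memory edits, and the pre-close flush threshold. These settings matter for both the stability and the performance of an HBase cluster.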

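1. The global memstore size of a RegionServer. When the combined size of all memstores exceeds this value, a flush to disk is forced. The default is 40% of the heap. Note that a RegionServer-level flush blocks client updates (writes), as the property description below states.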

<property>
<name>hbase.regionserver.global.memstore.size</name>
<value></value>
<description>Maximum size of all memstores in a region server before new
updates are blocked and flushes are forced. Defaults to 40% of heap (0.4).
Updates are blocked and flushes are forced until size of all memstores
in a region server hits hbase.regionserver.global.memstore.size.lower.limit.
The default value in this configuration has been intentionally left
empty in order to honor the old hbase.regionserver.global.memstore.upperLimit
property if present.
</description>
</property>

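2. This one can be thought of as a safety setting. Sometimes the write load on the cluster is very high and data keeps arriving faster than it can be flushed. In that case we want the memstore to stay under a safe bound, so writes are blocked until the memstore shrinks back to a "manageable" size. By default that size is heap size * 0.4 * 0.95. In other words, once a RegionServer-level flush has been triggered, client writes stay blocked until the total memstore size on the RegionServer drops to heap size * 0.4 * 0.95.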

<property>
<name>hbase.regionserver.global.memstore.size.lower.limit</name>
<value></value>
<description>Maximum size of all memstores in a region server before
flushes are forced.
Defaults to 95% of hbase.regionserver.global.memstore.size (0.95).
A 100% value for this value causes the minimum possible flushing to
occur when updates are blocked due to memstore limiting.
The default value in this configuration has been intentionally left
empty in order to honor the old hbase.regionserver.global.memstore.lowerLimit
property if present.
</description>
</property>
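
To make the two thresholds concrete, the arithmetic can be sketched in a few lines of Python. This only illustrates the defaults quoted above (0.4 and 0.95). The function name `global_memstore_limits` and the 16 GiB heap are assumptions made for the example and are not part of HBase.

```python
GIB = 1024 ** 3


def global_memstore_limits(heap_bytes, size_fraction=0.4, lower_limit_fraction=0.95):
    """Return (upper, lower) global memstore thresholds in bytes.

    upper: hbase.regionserver.global.memstore.size, a fraction of the heap
    lower: hbase.regionserver.global.memstore.size.lower.limit, a fraction of upper
    """
    upper = heap_bytes * size_fraction
    lower = upper * lower_limit_fraction
    return int(upper), int(lower)


# Hypothetical RegionServer with a 16 GiB heap.
upper, lower = global_memstore_limits(16 * GIB)
print(f"upper = {upper / GIB:.2f} GiB, lower = {lower / GIB:.2f} GiB")
```

For a 16 GiB heap this works out to roughly 6.4 GiB for the upper limit and 6.08 GiB for the lower limit.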

3. The size limit of the memstore within a single region. Once it is exceeded, the whole HRegion is flushed. The default is 128 MB (134217728 bytes).

<property>
<name>hbase.hregion.memstore.flush.size</name>
<value>134217728</value>
<description>
Memstore will be flushed to disk if size of the memstore
exceeds this number of bytes. Value is checked by a thread that runs
every hbase.server.thread.wakefrequency.
</description>
</property>
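
As a quick sanity check on the default value, the following Python sketch converts the byte count and models the size check. It is a simplified model of the rule in the description ("exceeds this number of bytes"), not HBase's actual code, and `needs_flush` is a hypothetical helper name.

```python
# Default of hbase.hregion.memstore.flush.size, in bytes.
DEFAULT_FLUSH_SIZE = 134217728


def needs_flush(memstore_bytes, flush_size=DEFAULT_FLUSH_SIZE):
    """Return True when the memstore size exceeds the configured flush size."""
    return memstore_bytes > flush_size


# 134217728 bytes is exactly 128 MiB.
print(DEFAULT_FLUSH_SIZE // (1024 * 1024), "MiB")
print(needs_flush(100 * 1024 * 1024))  # below the limit
print(needs_flush(130 * 1024 * 1024))  # above the limit
```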

4. The maximum time an edit may live in memory before it is flushed automatically. The default is 1 hour.

<property>
<name>hbase.regionserver.optionalcacheflushinterval</name>
<value>3600000</value>
<description>
Maximum amount of time an edit lives in memory before being automatically
flushed.
Default 1 hour. Set it to 0 to disable automatic flushing.
</description>
</property>
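
The time-based rule can be sketched in the same way. This is a simplified model under stated assumptions: `due_for_periodic_flush` is a hypothetical helper, the strict "older than" comparison is a guess at the boundary behaviour, and treating negative values like 0 is also an assumption. Only "0 disables automatic flushing" comes from the description above.

```python
# Default of hbase.regionserver.optionalcacheflushinterval, in milliseconds.
DEFAULT_INTERVAL_MS = 3600000


def due_for_periodic_flush(oldest_edit_age_ms, interval_ms=DEFAULT_INTERVAL_MS):
    """Return True when the oldest unflushed edit is older than the interval."""
    if interval_ms <= 0:
        # 0 disables automatic flushing (negative handled the same way here
        # purely as an assumption of this sketch).
        return False
    return oldest_edit_age_ms > interval_ms


print(DEFAULT_INTERVAL_MS // (60 * 1000), "minutes")
print(due_for_periodic_flush(30 * 60 * 1000))  # 30-minute-old edit
print(due_for_periodic_flush(90 * 60 * 1000))  # 90-minute-old edit
```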

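5. If the memstore of a region is this size or larger when the region is about to be closed, a "pre-flush" runs first to clear out the memstore, and only then is the region taken offline. Once a region is offline, it accepts no writes. If the memstore is large, a flush can take a long time. The pre-flush empties the bulk of the memstore before the region goes offline, so the flush that runs during the final close has little left to do and finishes quickly. The default is 5 MB (5242880 bytes).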

<property>
<name>hbase.hregion.preclose.flush.size</name>
<value>5242880</value>
<description>
If the memstores in a region are this size or larger when we go
to close, run a "pre-flush" to clear out memstores before we put up
the region closed flag and take the region offline. On close,
a flush is run under the close flag to empty memory. During
this time the region is offline and we are not taking on any writes.
If the memstore content is large, this flush could take a long time to
complete. The preflush is meant to clean out the bulk of the memstore
before putting up the close flag and taking the region offline so the
flush that runs under the close flag has little to do.
</description>
</property>
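
The close sequence described above can be summarised as a small decision sketch in Python. The step labels and the `close_plan` helper are hypothetical names chosen for illustration. Only the "this size or larger" comparison and the ordering of the steps are taken from the property description.

```python
# Default of hbase.hregion.preclose.flush.size, in bytes (5 MiB).
DEFAULT_PRECLOSE_FLUSH_SIZE = 5242880


def close_plan(memstore_bytes, preclose_flush_size=DEFAULT_PRECLOSE_FLUSH_SIZE):
    """Return the ordered steps taken when closing a region (simplified)."""
    steps = []
    if memstore_bytes >= preclose_flush_size:
        # Region is still online here, so writes keep being served.
        steps.append("pre-flush")
    steps.append("put up close flag")             # region stops taking writes
    steps.append("final flush under close flag")  # small if a pre-flush ran
    return steps


print(close_plan(64 * 1024 * 1024))  # large memstore: pre-flush first
print(close_plan(1 * 1024 * 1024))   # small memstore: close directly
```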