HBase Version: hbase-0.94.6-cdh4.3.0
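HBase scans can be tuned with two settings: scannerCaching and caching.
Both control how many rows the HBase client fetches from the HBase server in a single trip. Fetching scannerCaching/caching rows at a time reduces the number of round trips between client and server.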
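To make the effect of the batch size concrete, here is a rough back-of-the-envelope sketch. It is not HBase code: the class and method names (ScanRoundTrips, estimateFetches) are made up for illustration, and the estimate assumes a single region and ignores the scanner open/close calls, the final empty fetch, and the maximum result size limit.

```java
// Illustrative only: estimates how many fetch round trips a scan needs
// for a given caching value. Not part of the HBase API.
class ScanRoundTrips {

    /**
     * Approximate number of fetch round trips needed to stream rowCount rows
     * when each trip returns up to caching rows (ceiling division).
     */
    static long estimateFetches(long rowCount, int caching) {
        if (rowCount < 0 || caching <= 0) {
            throw new IllegalArgumentException("rowCount must be >= 0 and caching must be > 0");
        }
        return (rowCount + caching - 1) / caching;
    }

    public static void main(String[] args) {
        long rows = 1000000L; // hypothetical table size
        for (int caching : new int[] {1, 100, 1000}) {
            System.out.println("caching=" + caching
                + " -> about " + estimateFetches(rows, caching) + " fetches");
        }
    }
}
```

Under these assumptions, one million rows cost about 1,000,000 fetches with caching=1, but only about 10,000 with caching=100. The price is memory: each fetch holds more rows on the client at once.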
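scannerCaching is a property of HTable; caching is a property of Scan.
The HTable source shows that both of its accessors for this setting, getScannerCaching() and setScannerCaching(int), are already deprecated: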
public class HTable implements HTableInterface {
    protected int scannerCaching;

    /**
     * Gets the number of rows that a scanner will fetch at once.
     * <p>
     * The default value comes from {@code hbase.client.scanner.caching}.
     * @deprecated Use {@link Scan#setCaching(int)} and {@link Scan#getCaching()}
     */
    public int getScannerCaching() {
        return scannerCaching;
    }

    /**
     * Sets the number of rows that a scanner will fetch at once.
     * <p>
     * This will override the value specified by
     * {@code hbase.client.scanner.caching}.
     * Increasing this value will reduce the amount of work needed each time
     * {@code next()} is called on a scanner, at the expense of memory use
     * (since more rows will need to be maintained in memory by the scanners).
     * @param scannerCaching the number of rows a scanner will fetch at once.
     * @deprecated Use {@link Scan#setCaching(int)}
     */
    public void setScannerCaching(int scannerCaching) {
        this.scannerCaching = scannerCaching;
    }
}
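In other words, our own code should use Scan#setCaching(int) and Scan#getCaching() instead.
The HTable source also shows what happens when a scan is started: getScanner is called and returns a ResultScanner, and the caller then processes the results from that ResultScanner: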
/**
 * {@inheritDoc}
 */
@Override
public ResultScanner getScanner(final Scan scan) throws IOException {
    if (scan.getCaching() <= 0) {
        scan.setCaching(getScannerCaching());
    }
    return new ClientScanner(getConfiguration(), scan, getTableName(),
        this.connection);
}
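As the code above shows, the scannerCaching set on HTable is copied onto the Scan, but only when the Scan has no positive caching value of its own.
1. The ClientScanner constructor then picks up the caching value that HTable passed along on the Scan;
2. Of course, if the value arriving from HTable was never set (scannerCaching = 0), scan.getCaching() is still not positive, so ClientScanner does not take its caching from the Scan;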
/**
 * Create a new ClientScanner for the specified table
 * Note that the passed {@link Scan}'s start row may be changed.
 *
 * @param conf The {@link Configuration} to use.
 * @param scan {@link Scan} to use in this scanner
 * @param tableName The table that we wish to scan
 * @param connection Connection identifying the cluster
 * @throws IOException
 */
public ClientScanner(final Configuration conf, final Scan scan,
    final byte[] tableName, HConnection connection) throws IOException {
    if (LOG.isDebugEnabled()) {
        LOG.debug("Creating scanner over "
            + Bytes.toString(tableName)
            + " starting at key '" + Bytes.toStringBinary(scan.getStartRow()) + "'");
    }
    this.scan = scan;
    this.tableName = tableName;
    this.lastNext = System.currentTimeMillis();
    this.connection = connection;
    this.maxScannerResultSize = conf.getLong(
        HConstants.HBASE_CLIENT_SCANNER_MAX_RESULT_SIZE_KEY,
        HConstants.DEFAULT_HBASE_CLIENT_SCANNER_MAX_RESULT_SIZE);
    this.scannerTimeout = (int) conf.getLong(
        HConstants.HBASE_REGIONSERVER_LEASE_PERIOD_KEY,
        HConstants.DEFAULT_HBASE_REGIONSERVER_LEASE_PERIOD);

    // check if application wants to collect scan metrics
    byte[] enableMetrics = scan.getAttribute(
        Scan.SCAN_ATTRIBUTES_METRICS_ENABLE);
    if (enableMetrics != null && Bytes.toBoolean(enableMetrics)) {
        scanMetrics = new ScanMetrics();
    }

    // Use the caching from the Scan. If not set, use the default cache setting for this table.
    if (this.scan.getCaching() > 0) {
        this.caching = this.scan.getCaching();
    } else {
        this.caching = conf.getInt("hbase.client.scanner.caching", 1);
    }

    // initialize the scanner
    nextScanner(this.caching, false);
}
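3. Finally, if neither 1 nor 2 applies (neither HTable nor Scan has scannerCaching/caching set), the bad news arrives: the default hbase.client.scanner.caching=1 is used, so each trip to the server brings back a single row.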
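The fallback order described in points 1 to 3 can be summarized in a small standalone model. This is a sketch, not HBase code: ScannerCachingModel and effectiveCaching are hypothetical names, and the logic only mirrors the two if-checks in the getScanner and ClientScanner excerpts quoted above.

```java
// Illustrative model of how the effective caching value is resolved,
// based on the HTable.getScanner and ClientScanner excerpts above.
class ScannerCachingModel {

    /**
     * @param scanCaching         value set on the Scan (not positive means "unset")
     * @param tableScannerCaching value set on the HTable (not positive means "unset")
     * @param confCaching         value of hbase.client.scanner.caching (1 if absent)
     */
    static int effectiveCaching(int scanCaching, int tableScannerCaching, int confCaching) {
        int caching = scanCaching;
        // Mirrors HTable.getScanner: the table's value is used only if the Scan has none.
        if (caching <= 0) {
            caching = tableScannerCaching;
        }
        // Mirrors the ClientScanner constructor: use the Scan's value if positive,
        // otherwise fall back to the configuration.
        if (caching > 0) {
            return caching;
        }
        return confCaching;
    }

    public static void main(String[] args) {
        System.out.println(effectiveCaching(500, 100, 1)); // Scan wins
        System.out.println(effectiveCaching(-1, 100, 1));  // HTable value is used
        System.out.println(effectiveCaching(-1, 0, 1));    // configuration default
    }
}
```

Seen this way, the simplest way to avoid the one-row-per-trip case is to call Scan#setCaching(int) explicitly on every Scan, which is also what the deprecation notes in HTable recommend.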