hbase

http://grokbase.com/t/hbase/user/125ya2cxxs/scan-addfamily-vs-familyfilter-equal

http://stackoverflow.com/questions/7256100/scan-with-filter-using-hbase-shell

 

Just to add on:
the Javadoc for FamilyFilter clearly says:

* If an already known column family is looked for, use {@link
* org.apache.hadoop.hbase.client.Get#addFamily(byte[])}
* directly rather than a filter.

So addFamily should be the better choice.

Regards
Ram

-----Original Message-----
From: Anoop Sam John
Sent: Thursday, May 31, 2012 11:49 AM
To: user@hbase.apache.org
Subject: RE: Scan addFamily vs FamilyFilter(EQUAL, ...)

Hi,
As per my understanding of the Scan code, in your scenario, where
you want to scan only some of the CFs (not all), you should go with
Scan#addFamily.
FamilyFilter achieves the same result, but there is a difference
in performance.
When one specifies the CFs in the scan, scanners are created only
for those Stores. For the other CFs there won't be any scanners,
so those stores are not scanned (the HFile data is not fetched).
When one instead uses FamilyFilter without restricting the families
(using Scan#addFamily), all the stores get scanned and data
gets fetched from the HFiles. Only the KVs that match your
FamilyFilter are then included in the Result; the others are
simply dropped. So I feel there will be a performance
difference. Please correct me if I am wrong.
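
The difference described above can be sketched with a small toy model in plain Java. This is not HBase code: the class `StoreScanModel`, its methods, and the sample data are all hypothetical, and a "store" is just a list of strings. It only illustrates the reasoning that restricting the families up front avoids reading the other stores, while filtering afterwards reads every store and discards the non-matching cells.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only, NOT HBase code. A "region" maps each column family name
// to the cells held in that family's store.
public class StoreScanModel {
    // Number of cells read from stores during the most recent scan.
    static int cellsRead = 0;

    // Models the Scan#addFamily case: only the requested family's store is opened.
    static List<String> scanWithAddFamily(Map<String, List<String>> region, String family) {
        cellsRead = 0;
        List<String> result = new ArrayList<>();
        for (String cell : region.getOrDefault(family, Collections.<String>emptyList())) {
            cellsRead++;
            result.add(cell);
        }
        return result;
    }

    // Models the FamilyFilter-only case: every store is read, and cells from
    // other families are dropped after they have already been read.
    static List<String> scanWithFamilyFilter(Map<String, List<String>> region, String family) {
        cellsRead = 0;
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> store : region.entrySet()) {
            for (String cell : store.getValue()) {
                cellsRead++;
                if (store.getKey().equals(family)) {
                    result.add(cell);
                }
            }
        }
        return result;
    }

    // Hypothetical sample data: two cells in cf1, three cells in cf2.
    static Map<String, List<String>> sampleRegion() {
        Map<String, List<String>> region = new LinkedHashMap<>();
        region.put("cf1", Arrays.asList("r1/cf1:a", "r2/cf1:a"));
        region.put("cf2", Arrays.asList("r1/cf2:x", "r1/cf2:y", "r2/cf2:x"));
        return region;
    }

    public static void main(String[] args) {
        Map<String, List<String>> region = sampleRegion();
        System.out.println(scanWithAddFamily(region, "cf1") + " cells read: " + cellsRead);
        System.out.println(scanWithFamilyFilter(region, "cf1") + " cells read: " + cellsRead);
    }
}
```

Both methods return the same cells; only the amount of data touched differs. How large the gap is on a real cluster depends on the table layout and the HBase version, which this model does not capture.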

@Stack
One thing I ran into when using the Scan.addFamily / Scan.addColumn is
that those two methods overwrite each other.
The Scan#addColumn Javadoc clearly documents this overriding,
so it seems to be intentional, correct?
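
The overriding behaviour can be illustrated with a simplified model of the family map that a Scan carries. This is a sketch, not the HBase class: `FamilyMapModel` and `describe` are hypothetical names, and plain strings stand in for byte arrays. It assumes, following the Javadoc, that a whole-family selection is stored as a family with no qualifier set, so each call replaces what the other recorded for that family.

```java
import java.util.Map;
import java.util.NavigableSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Simplified model of a scan's family map; NOT the HBase Scan class.
// A null qualifier set stands for "every column in the family".
public class FamilyMapModel {
    private final Map<String, NavigableSet<String>> familyMap = new TreeMap<>();

    // Models Scan#addFamily: select the whole family, discarding any
    // qualifiers previously added for it.
    public FamilyMapModel addFamily(String family) {
        familyMap.put(family, null);
        return this;
    }

    // Models Scan#addColumn: narrow the family to explicit qualifiers,
    // replacing an earlier whole-family selection.
    public FamilyMapModel addColumn(String family, String qualifier) {
        NavigableSet<String> qualifiers = familyMap.get(family);
        if (qualifiers == null) {
            qualifiers = new TreeSet<>();
            familyMap.put(family, qualifiers);
        }
        qualifiers.add(qualifier);
        return this;
    }

    // Describes what is currently selected for a family.
    public String describe(String family) {
        if (!familyMap.containsKey(family)) {
            return "not selected";
        }
        NavigableSet<String> qualifiers = familyMap.get(family);
        return qualifiers == null ? "all columns" : qualifiers.toString();
    }

    public static void main(String[] args) {
        // addColumn after addFamily narrows the selection to one column.
        System.out.println(new FamilyMapModel().addFamily("cf").addColumn("cf", "a").describe("cf"));
        // addFamily after addColumn widens it back to the whole family.
        System.out.println(new FamilyMapModel().addColumn("cf", "a").addFamily("cf").describe("cf"));
    }
}
```

In this model the order of the calls decides the outcome, which is why mixing the two methods for the same family can give surprising results.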


-Anoop-
________________________________________
From: saint.ack@gmail.com [saint.ack@gmail.com] on behalf of Stack
[stack@duboce.net]
Sent: Wednesday, May 30, 2012 11:13 PM
To: user@hbase.apache.org
Subject: Re: Scan addFamily vs FamilyFilter(EQUAL, ...)
On Wed, May 30, 2012 at 9:59 AM, Kevin wrote:
> I am curious and trying to learn which method is best when wanting to limit
> a scan to a particular column or column family. The Scan class carries a
> Filter instance and a TreeMap of the family map and I am unsure how they
> get carried through to the server-side functionality. In terms of
> performance is there any difference between doing Scan.addFamily(x) and
> Scan.setFilter(new FamilyFilter(CompareFilter.CompareOp.EQUAL, x))?

There is probably no noticeable difference in performance, but
Scan#addFamily is the more natural way of expressing column family
scoping.
St.Ack
 
 
 

Reposted from: https://www.cnblogs.com/jvava/p/4580956.html
