HBase Shell Filters

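This article explains in detail how to use filters in the HBase shell, covering row-key, column-family, column-name (qualifier), and column-value filters, to help readers understand how to select data from an HBase table under different conditions.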

Sample data:

hbase(main):046:0> scan 'hbaseFilter'
ROW                                              COLUMN+CELL                                                                                                                                   
 row0                                            column=f:age, timestamp=1499150787863, value=age0                                                                                             
 row0                                            column=f:name, timestamp=1499150787863, value=name0                                                                                           
 row1                                            column=f:age, timestamp=1499150787875, value=age1                                                                                             
 row1                                            column=f:name, timestamp=1499150787875, value=name1                                                                                           
 row10                                           column=f2:age, timestamp=1499150787901, value=age10                                                                                           
 row10                                           column=f2:name, timestamp=1499150787901, value=name10                                                                                         
 row11                                           column=f2:age, timestamp=1499150787905, value=age11                                                                                           
 row11                                           column=f2:name, timestamp=1499150787905, value=name11                                                                                         
 row12                                           column=f2:age, timestamp=1499150787908, value=age12                                                                                           
 row12                                           column=f2:name, timestamp=1499150787908, value=name12                                                                                         
 row13                                           column=f2:age, timestamp=1499150787911, value=age13                                                                                           
 row13                                           column=f2:name, timestamp=1499150787911, value=name13                                                                                         
 row14                                           column=f2:age, timestamp=1499150787913, value=age14                                                                                           
 row14                                           column=f2:name, timestamp=1499150787913, value=name14                                                                                         
 row15                                           column=f2:age, timestamp=1499150787917, value=age15                                                                                           
 row15                                           column=f2:name, timestamp=1499150787917, value=name15                                                                                         
 row16                                           column=f2:age, timestamp=1499150787920, value=age16                                                                                           
 row16                                           column=f2:name, timestamp=1499150787920, value=name16                                                                                         
 row17                                           column=f2:age, timestamp=1499150787923, value=age17                                                                                           
 row17                                           column=f2:name, timestamp=1499150787923, value=name17                                                                                         
 row18                                           column=f2:age, timestamp=1499150787927, value=age18                                                                                           
 row18                                           column=f2:name, timestamp=1499150787927, value=name18                                                                                         
 row19                                           column=f2:age, timestamp=1499150787930, value=age19                                                                                           
 row19                                           column=f2:name, timestamp=1499150787930, value=name19                                                                                         
 row2                                            column=f:age, timestamp=1499150787879, value=age2                                                                                             
 row2                                            column=f:name, timestamp=1499150787879, value=name2                                                                                           
 row3                                            column=f:age, timestamp=1499150787882, value=age3                                                                                             
 row3                                            column=f:name, timestamp=1499150787882, value=name3                                                                                           
 row4                                            column=f:age, timestamp=1499150787885, value=age4                                                                                             
 row4                                            column=f:name, timestamp=1499150787885, value=name4                                                                                           
 row5                                            column=f:age, timestamp=1499150787888, value=age5                                                                                             
 row5                                            column=f:name, timestamp=1499150787888, value=name5                                                                                           
 row6                                            column=f:age, timestamp=1499150787890, value=age6                                                                                             
 row6                                            column=f:name, timestamp=1499150787890, value=name6                                                                                           
 row7                                            column=f:age, timestamp=1499150787893, value=age7                                                                                             
 row7                                            column=f:name, timestamp=1499150787893, value=name7                                                                                           
 row8                                            column=f:age, timestamp=1499150787896, value=age8                                                                                             
 row8                                            column=f:name, timestamp=1499150787896, value=name8                                                                                           
 row9                                            column=f:age, timestamp=1499150787898, value=age9                                                                                             
 row9                                            column=f:name, timestamp=1499150787898, value=name9                                                                                           
20 row(s) in 0.1990 seconds
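
The post does not show how the table was populated. As a minimal sketch, assuming the table was created with the two column families `f` and `f2` and filled with one `put` per cell, the commands could be generated with plain Ruby as below. The helper name `sample_puts` is hypothetical; the emitted lines use the shell's usual `put 'table', 'row', 'family:qualifier', 'value'` form.

```ruby
# Hypothetical helper: builds the HBase shell `put` commands that would
# reproduce the sample data (rows 0-9 in family f, rows 10-19 in family f2).
def sample_puts(table = 'hbaseFilter')
  (0..19).flat_map do |i|
    family = i < 10 ? 'f' : 'f2'
    %w[age name].map do |qualifier|
      "put '#{table}', 'row#{i}', '#{family}:#{qualifier}', '#{qualifier}#{i}'"
    end
  end
end

# Print a script that can be pasted into the HBase shell.
puts "create 'hbaseFilter', 'f', 'f2'"
puts sample_puts
```

The timestamps will differ from the ones in the scan above, since HBase assigns them at write time.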

    Run the show_filters command in the HBase shell to list the filters that are available.

hbase(main):009:0> show_filters
ColumnPrefixFilter                                                                                                                                                                             
TimestampsFilter                                                                                                                                                                               
PageFilter                                                                                                                                                                                     
MultipleColumnPrefixFilter                                                                                                                                                                     
FamilyFilter                                                                                                                                                                                   
ColumnPaginationFilter                                                                                                                                                                         
SingleColumnValueFilter                                                                                                                                                                        
RowFilter                                                                                                                                                                                      
QualifierFilter                                                                                                                                                                                
ColumnRangeFilter                                                                                                                                                                              
ValueFilter                                                                                                                                                                                    
PrefixFilter                                                                                                                                                                                   
SingleColumnValueExcludeFilter                                                                                                                                                                 
ColumnCountGetFilter                                                                                                                                                                           
InclusiveStopFilter                                                                                                                                                                            
DependentColumnFilter                                                                                                                                                                          
FirstKeyOnlyFilter                                                                                                                                                                             
KeyOnlyFilter     

1. KeyOnlyFilter

    Returns only the key part of each cell; every value comes back empty.

import org.apache.hadoop.hbase.filter.KeyOnlyFilter;
scan 'hbaseFilter', {FILTER=>KeyOnlyFilter.new()}
ROW                                              COLUMN+CELL                                                                                                                                   
 row0                                            column=f:age, timestamp=1499150787863, value=                                                                                                 
 row0                                            column=f:name, timestamp=1499150787863, value=                                                                                                
 row1                                            column=f:age, timestamp=1499150787875, value=                                                                                                 
 row1                                            column=f:name, timestamp=1499150787875, value=                                                                                                
 row10                                           column=f2:age, timestamp=1499150787901, value=   

2. FirstKeyOnlyFilter

    Returns only the first cell of each row.

hbase(main):096:0> import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
hbase(main):097:0* scan 'hbaseFilter',{FILTER=>FirstKeyOnlyFilter.new()}
ROW                                              COLUMN+CELL                                                                                                                                   
 row0                                            column=f:age, timestamp=1499150787863, value=age0                                                                                             
 row1                                            column=f:age, timestamp=1499150787875, value=age1                                                                                             
 row10                                           column=f2:age, timestamp=1499150787901, value=age10                                                                                           
 row11                                           column=f2:age, timestamp=1499150787905, value=age11                                                                                           
 row12                                           column=f2:age, timestamp=1499150787908, value=age12  
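
To make the "first cell per row" behaviour concrete, here is a plain-Ruby model of it. It is a sketch over an in-memory copy of the sample data, not HBase's implementation, and the names `sample_cells` and `first_key_only` are made up for illustration.

```ruby
# In-memory stand-in for the sample table: one [row, column, value] triple
# per cell, sorted the way a scan returns them.
def sample_cells
  (0..19).flat_map do |i|
    family = i < 10 ? 'f' : 'f2'
    [["row#{i}", "#{family}:age", "age#{i}"],
     ["row#{i}", "#{family}:name", "name#{i}"]]
  end.sort
end

# Keep only the first cell of every row, as FirstKeyOnlyFilter does.
def first_key_only(cells)
  cells.group_by { |row, _, _| row }.map { |_, row_cells| row_cells.first }
end

result = first_key_only(sample_cells)
puts result.first(3).map { |cell| cell.join(' ') }
puts "cells returned: #{result.length}"
```

Because exactly one cell survives per row, the number of cells returned equals the number of rows, which is why this filter is often used for counting rows cheaply.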

3. PrefixFilter

     Filters rows by row-key prefix. Only row3 matches below because no other row key starts with 'row3'.

hbase(main):106:0> import org.apache.hadoop.hbase.filter.PrefixFilter;
hbase(main):107:0* import org.apache.hadoop.hbase.util.Bytes;
hbase(main):108:0* scan 'hbaseFilter',{FILTER=>PrefixFilter.new(Bytes.toBytes('row3'))}
ROW                                              COLUMN+CELL                                                                                                                                   
 row3                                            column=f:age, timestamp=1499150787882, value=age3                                                                                             
 row3                                            column=f:name, timestamp=1499150787882, value=name3                                                                                           
1 row(s) in 0.1670 seconds
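
The prefix match is a plain "starts with" test on the row key, so a shorter prefix can match many rows. The following plain-Ruby sketch models that on the sample row keys; `prefix_filter` is a hypothetical helper, not an HBase call.

```ruby
# Row keys of the sample table, in scan (lexicographic) order.
def sample_row_keys
  (0..19).map { |i| "row#{i}" }.sort
end

# Keep the row keys that start with the given prefix.
def prefix_filter(row_keys, prefix)
  row_keys.select { |key| key.start_with?(prefix) }
end

p prefix_filter(sample_row_keys, 'row3')        # only row3
p prefix_filter(sample_row_keys, 'row1').length # row1 plus row10..row19
```

With the prefix 'row1' the model keeps 11 rows, since row10 through row19 share that prefix with row1.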

4. ColumnPrefixFilter

    Returns only the columns whose name (qualifier) starts with the given prefix.

hbase(main):113:0> import org.apache.hadoop.hbase.filter.ColumnPrefixFilter;
hbase(main):114:0* import org.apache.hadoop.hbase.util.Bytes;
hbase(main):115:0* scan 'hbaseFilter',{STARTROW=>'row0',STOPROW=>'row99',FILTER=>ColumnPrefixFilter.new(Bytes.toBytes('n'))}
ROW                                              COLUMN+CELL                                                                                                                                   
 row0                                            column=f:name, timestamp=1499150787863, value=name0                                                                                           
 row1                                            column=f:name, timestamp=1499150787875, value=name1                                                                                           
 row10                                           column=f2:name, timestamp=1499150787901, value=name10                                                                                         
 row11                                           column=f2:name, timestamp=1499150787905, value=name11                                                                                         
 row12                                           column=f2:name, timestamp=1499150787908, value=name12                                                                                         
 row13                                           column=f2:name, timestamp=1499150787911, value=name13   

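5. MultipleColumnPrefixFilter

    Filters by column-name prefix and accepts several prefixes at once. It is not a range: the example below keeps columns whose name starts with 'a' or with 'b'.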

hbase(main):011:0> scan 'hbaseFilter',{STARTROW=>'row0',STOPROW=>'row99',FILTER=>"MultipleColumnPrefixFilter('a','b')"}
ROW                                              COLUMN+CELL                                                                                                                                   
 row0                                            column=f:age, timestamp=1499150787863, value=age0                                                                                             
 row1                                            column=f:age, timestamp=1499150787875, value=age1                                                                                             
 row10                                           column=f2:age, timestamp=1499150787901, value=age10                                                                                           
 row11                                           column=f2:age, timestamp=1499150787905, value=age11                                                                                           
 row12                                           column=f2:age, timestamp=1499150787908, value=age12    
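
A plain-Ruby sketch of the "any of these prefixes" rule follows. The helper `multiple_column_prefix` and the extra qualifier `bonus` are hypothetical and only serve to show that the arguments are independent prefixes.

```ruby
# Keep the qualifiers that start with at least one of the given prefixes.
def multiple_column_prefix(qualifiers, *prefixes)
  qualifiers.select { |q| prefixes.any? { |prefix| q.start_with?(prefix) } }
end

p multiple_column_prefix(%w[age name], 'a', 'b')       # ["age"]
p multiple_column_prefix(%w[age bonus name], 'a', 'b') # ["age", "bonus"]
p multiple_column_prefix(%w[age name], 'a', 'n')       # ["age", "name"]
```

In the sample table no qualifier starts with 'b', which is why only the age column appears in the scan output above.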

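6. ColumnCountGetFilter

   Returns at most the first n columns of a row (2 in the example below). The HBase API documentation describes this filter as written for Get operations rather than scans.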

hbase(main):012:0> scan 'hbaseFilter',{STARTROW=>'row0',STOPROW=>'row99',FILTER=>"ColumnCountGetFilter(2)"}
ROW                                              COLUMN+CELL                                                                                                                                   
 row0                                            column=f:age, timestamp=1499150787863, value=age0                                                                                             
 row0                                            column=f:name, timestamp=1499150787863, value=name0                                                                                           
 row1                                            column=f:age, timestamp=1499150787875, value=age1                                                                                             
 row1                                            column=f:name, timestamp=1499150787875, value=name1                                                                                           
 row10                                           column=f2:age, timestamp=1499150787901, value=age10                                                                                           
 row10                                           column=f2:name, timestamp=1499150787901, value=name10                                                                                         
 row11                                           column=f2:age, timestamp=1499150787905, value=age11                                                                                           
 row11                                           column=f2:name, timestamp=1499150787905, value=name11                                                                                         
 row12                                           column=f2:age, timestamp=1499150787908, value=age12                                                                                           
 row12                                           column=f2:name, timestamp=1499150787908, value=name12   

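7. PageFilter

   Limits the number of rows returned (3 in the example below). The limit is applied on each region server separately, so a scan that spans several regions can return more rows than the page size.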

hbase(main):013:0> scan 'hbaseFilter',{STARTROW=>'row0',STOPROW=>'row99',FILTER=>"PageFilter(3)"}
ROW                                              COLUMN+CELL                                                                                                                                   
 row0                                            column=f:age, timestamp=1499150787863, value=age0                                                                                             
 row0                                            column=f:name, timestamp=1499150787863, value=name0                                                                                           
 row1                                            column=f:age, timestamp=1499150787875, value=age1                                                                                             
 row1                                            column=f:name, timestamp=1499150787875, value=name1                                                                                           
 row10                                           column=f2:age, timestamp=1499150787901, value=age10                                                                                           
 row10                                           column=f2:name, timestamp=1499150787901, value=name10                                                                                         
3 row(s) in 0.1680 seconds

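8. ColumnPaginationFilter

   Returns columns by limit and offset. ColumnPaginationFilter(limit, offset) skips the first offset columns of each row and then returns at most limit columns.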

hbase(main):014:0> scan 'hbaseFilter',{STARTROW=>'row0',STOPROW=>'row99',FILTER=>"ColumnPaginationFilter(2,1)"}
ROW                                              COLUMN+CELL                                                                                                                                   
 row0                                            column=f:name, timestamp=1499150787863, value=name0                                                                                           
 row1                                            column=f:name, timestamp=1499150787875, value=name1                                                                                           
 row10                                           column=f2:name, timestamp=1499150787901, value=name10                                                                                         
 row11                                           column=f2:name, timestamp=1499150787905, value=name11                                                                                         
 row12                                           column=f2:name, timestamp=1499150787908, value=name12                                                                                         
 row13                                           column=f2:name, timestamp=1499150787911, value=name13  
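
The limit/offset arithmetic can be modelled in one line of plain Ruby. This is a sketch of the semantics for a single row; `column_pagination` is a hypothetical helper name.

```ruby
# Skip the first `offset` columns of a row, then keep at most `limit` columns.
def column_pagination(columns, limit, offset)
  columns.drop(offset).first(limit)
end

p column_pagination(%w[f:age f:name], 2, 1) # ["f:name"]
p column_pagination(%w[f:age f:name], 1, 0) # ["f:age"]
```

With limit 2 and offset 1, each sample row has only one column left after the offset, so only the name column is returned even though the limit would allow two.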

9. InclusiveStopFilter

   Sets the row at which the scan stops. Unlike STOPROW, the stop row itself is included in the result.

hbase(main):015:0> scan 'hbaseFilter',{STARTROW=>'row0',STOPROW=>'row99',FILTER=>"InclusiveStopFilter('row15')"}
ROW                                              COLUMN+CELL                                                                                                                                   
 row0                                            column=f:age, timestamp=1499150787863, value=age0                                                                                             
 row0                                            column=f:name, timestamp=1499150787863, value=name0                                                                                           
 row1                                            column=f:age, timestamp=1499150787875, value=age1                                                                                             
 row1                                            column=f:name, timestamp=1499150787875, value=name1                                                                                           
 row10                                           column=f2:age, timestamp=1499150787901, value=age10                                                                                           
 row10                                           column=f2:name, timestamp=1499150787901, value=name10                                                                                         
 row11                                           column=f2:age, timestamp=1499150787905, value=age11                                                                                           
 row11                                           column=f2:name, timestamp=1499150787905, value=name11                                                                                         
 row12                                           column=f2:age, timestamp=1499150787908, value=age12                                                                                           
 row12                                           column=f2:name, timestamp=1499150787908, value=name12                                                                                         
 row13                                           column=f2:age, timestamp=1499150787911, value=age13                                                                                           
 row13                                           column=f2:name, timestamp=1499150787911, value=name13                                                                                         
 row14                                           column=f2:age, timestamp=1499150787913, value=age14                                                                                           
 row14                                           column=f2:name, timestamp=1499150787913, value=name14                                                                                         
 row15                                           column=f2:age, timestamp=1499150787917, value=age15                                                                                           
 row15                                           column=f2:name, timestamp=1499150787917, value=name15                                                                                         
8 row(s) in 0.2250 seconds
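
The difference between an inclusive and an exclusive stop row is a single comparison. The plain-Ruby sketch below models both on the sample row keys; `rows_until` is a hypothetical helper, and the exclusive case mirrors how STOPROW behaves by default.

```ruby
# Row keys of the sample table, in scan (lexicographic) order.
def sample_row_keys
  (0..19).map { |i| "row#{i}" }.sort
end

# Keep rows up to the stop row, including it only when `inclusive` is true.
def rows_until(row_keys, stop_row, inclusive:)
  row_keys.select { |key| inclusive ? key <= stop_row : key < stop_row }
end

p rows_until(sample_row_keys, 'row15', inclusive: true).length  # 8
p rows_until(sample_row_keys, 'row15', inclusive: false).length # 7
```

The eight rows in the inclusive case are row0, row1, and row10 through row15, matching the scan output above.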

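10. TimestampsFilter

   Returns only the cells whose timestamp exactly equals one of the listed timestamps. The arguments are a list of timestamps, not a range.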

hbase(main):016:0> scan 'hbaseFilter',{STARTROW=>'row0',STOPROW=>'row99',FILTER=>"TimestampsFilter(1499150787875,1499150787913)"}
ROW                                              COLUMN+CELL                                                                                                                                   
 row1                                            column=f:age, timestamp=1499150787875, value=age1                                                                                             
 row1                                            column=f:name, timestamp=1499150787875, value=name1                                                                                           
 row14                                           column=f2:age, timestamp=1499150787913, value=age14                                                                                           
 row14                                           column=f2:name, timestamp=1499150787913, value=name14                                                                                         
2 row(s) in 0.0340 seconds
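
A plain-Ruby sketch of the exact-match rule, using four cells copied from the sample scan. The helpers `sample_timestamps` and `timestamps_filter` are hypothetical names for this illustration.

```ruby
# A few cells copied from the sample scan: [row, column, timestamp].
def sample_timestamps
  [
    ['row1',  'f:age',  1499150787875],
    ['row13', 'f2:age', 1499150787911],
    ['row14', 'f2:age', 1499150787913],
    ['row2',  'f:age',  1499150787879]
  ]
end

# Keep a cell only when its timestamp is one of the listed values.
def timestamps_filter(cells, *timestamps)
  cells.select { |_, _, ts| timestamps.include?(ts) }
end

kept = timestamps_filter(sample_timestamps, 1499150787875, 1499150787913)
p kept.map(&:first) # ["row1", "row14"]
```

The timestamps of row13 and row2 lie between the two listed values but are dropped, which shows that the arguments are not treated as the ends of a range.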

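11. RowFilter

   Filters by comparing the row key with a given value. The binary comparator compares bytes lexicographically, so row10 through row19 sort before row6 and are excluded below.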

hbase(main):018:0> scan 'hbaseFilter',{STARTROW=>'row0',STOPROW=>'row99',FILTER=>"RowFilter(>=,'binary:row6')"}
ROW                                              COLUMN+CELL                                                                                                                                   
 row6                                            column=f:age, timestamp=1499150787890, value=age6                                                                                             
 row6                                            column=f:name, timestamp=1499150787890, value=name6                                                                                           
 row7                                            column=f:age, timestamp=1499150787893, value=age7                                                                                             
 row7                                            column=f:name, timestamp=1499150787893, value=name7                                                                                           
 row8                                            column=f:age, timestamp=1499150787896, value=age8                                                                                             
 row8                                            column=f:name, timestamp=1499150787896, value=name8                                                                                           
 row9                                            column=f:age, timestamp=1499150787898, value=age9                                                                                             
 row9                                            column=f:name, timestamp=1499150787898, value=name9                                                                                           
4 row(s) in 0.2340 seconds
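
The result above follows from byte-wise string ordering. The plain-Ruby sketch below reproduces it and contrasts it with zero-padded row keys; `row_filter_gte` and the padded key scheme are assumptions added for illustration and are not part of the original table.

```ruby
# Keep the row keys that compare greater than or equal to the bound,
# using byte-wise string comparison as a binary comparator would.
def row_filter_gte(row_keys, bound)
  row_keys.sort.select { |key| (key <=> bound) >= 0 }
end

plain  = (0..19).map { |i| "row#{i}" }
padded = (0..19).map { |i| format('row%02d', i) } # hypothetical key scheme

p row_filter_gte(plain, 'row6')          # ["row6", "row7", "row8", "row9"]
p row_filter_gte(padded, 'row06').length # 14
```

With zero-padded keys the lexicographic order matches the numeric order, so the same comparison keeps rows 6 through 19.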

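12. FamilyFilter

   Filters by column family. The substring comparator matches any family name that contains 'f', so both f and f2 are returned below.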

hbase(main):020:0> scan 'hbaseFilter',{STARTROW=>'row0',STOPROW=>'row99',FILTER=>"FamilyFilter(=,'substring:f')"}
ROW                                              COLUMN+CELL                                                                                                                                   
 row0                                            column=f:age, timestamp=1499150787863, value=age0                                                                                             
 row0                                            column=f:name, timestamp=1499150787863, value=name0                                                                                           
 row1                                            column=f:age, timestamp=1499150787875, value=age1                                                                                             
 row1                                            column=f:name, timestamp=1499150787875, value=name1                                                                                           
 row10                                           column=f2:age, timestamp=1499150787901, value=age10                                                                                           
 row10                                           column=f2:name, timestamp=1499150787901, value=name10                                                                                         
 row11                                           column=f2:age, timestamp=1499150787905, value=age11                                                                                           
 row11                                           column=f2:name, timestamp=1499150787905, value=name11     
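
To select only family f, an exact comparator such as `"FamilyFilter(=,'binary:f')"` would be needed instead of the substring one. The plain-Ruby sketch below models the two comparators on the family names of the sample table; `family_filter` is a hypothetical helper, and the case-insensitive substring match is modelled on the assumption that HBase's substring comparator ignores case.

```ruby
# Keep the family names accepted by an "equals" filter with the given comparator.
def family_filter(families, comparator, operand)
  case comparator
  when :binary
    families.select { |f| f == operand } # exact match
  when :substring
    # Assumption: substring matching ignores case.
    families.select { |f| f.downcase.include?(operand.downcase) }
  else
    raise ArgumentError, "unsupported comparator: #{comparator}"
  end
end

p family_filter(%w[f f2], :substring, 'f') # ["f", "f2"]
p family_filter(%w[f f2], :binary, 'f')    # ["f"]
```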

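13. QualifierFilter

   Filters by column name (qualifier). The example below uses a regular expression that matches an 'n' followed by any character.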

hbase(main):023:0> scan 'hbaseFilter',{STARTROW=>'row0',STOPROW=>'row99',FILTER=>"QualifierFilter(=,'regexstring:n.')"}
ROW                                              COLUMN+CELL                                                                                                                                   
 row0                                            column=f:name, timestamp=1499150787863, value=name0                                                                                           
 row1                                            column=f:name, timestamp=1499150787875, value=name1                                                                                           
 row10                                           column=f2:name, timestamp=1499150787901, value=name10                                                                                         
 row11                                           column=f2:name, timestamp=1499150787905, value=name11                                                                                         
 row12                                           column=f2:name, timestamp=1499150787908, value=name12                                                                                         
 row13                                           column=f2:name, timestamp=1499150787911, value=name13  
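
To see why `regexstring:n.` keeps `name` but drops `age`, the comparison can be modelled in plain Python. This is only a sketch of the semantics, not HBase API: `cells` and `qualifier_filter` are hypothetical names, and it assumes the regex comparator matches anywhere inside the qualifier (a search, not a full match).

```python
import re

# A few cells copied from the sample table: (row, family, qualifier, value).
cells = [
    ("row0", "f", "age", "age0"),
    ("row0", "f", "name", "name0"),
    ("row10", "f2", "age", "age10"),
    ("row10", "f2", "name", "name10"),
]

def qualifier_filter(cells, pattern):
    """Keep cells whose qualifier contains a match for the regex.

    Assumption: search semantics, i.e. the pattern may match any
    part of the qualifier, which is how "n." can match "name".
    """
    rx = re.compile(pattern)
    return [c for c in cells if rx.search(c[2])]

kept = qualifier_filter(cells, "n.")
for row, family, qualifier, value in kept:
    print(row, f"{family}:{qualifier}", value)
```

The filter looks only at the qualifier, so cells from both families f and f2 survive as long as the column name matches.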

14. ValueFilter

   Filters by cell value; only the matching cells (columns) are returned, not the rest of the row.

hbase(main):024:0> scan 'hbaseFilter',{STARTROW=>'row0',STOPROW=>'row99',FILTER=>"ValueFilter(=,'binary:name3')"}
ROW                                              COLUMN+CELL                                                                                                                                   
 row3                                            column=f:name, timestamp=1499150787882, value=name3                                                                                           
1 row(s) in 0.0140 seconds

15. SingleColumnValueFilter

   Selects rows by the value of one specific column; every cell of a matching row is returned.

hbase(main):035:0> scan 'hbaseFilter',{STARTROW=>'row0',STOPROW=>'row99',COLUMN=>'f',FILTER=>"SingleColumnValueFilter('f','name',>=,'binary:name3')"}
ROW                                              COLUMN+CELL                                                                                                                                   
 row3                                            column=f:age, timestamp=1499150787882, value=age3                                                                                             
 row3                                            column=f:name, timestamp=1499150787882, value=name3                                                                                           
 row4                                            column=f:age, timestamp=1499150787885, value=age4                                                                                             
 row4                                            column=f:name, timestamp=1499150787885, value=name4                                                                                           
 row5                                            column=f:age, timestamp=1499150787888, value=age5                                                                                             
 row5                                            column=f:name, timestamp=1499150787888, value=name5                                                                                           
 row6                                            column=f:age, timestamp=1499150787890, value=age6                                                                                             
 row6                                            column=f:name, timestamp=1499150787890, value=name6                                                                                           
 row7                                            column=f:age, timestamp=1499150787893, value=age7                                                                                             
 row7                                            column=f:name, timestamp=1499150787893, value=name7                                                                                           
 row8                                            column=f:age, timestamp=1499150787896, value=age8                                                                                             
 row8                                            column=f:name, timestamp=1499150787896, value=name8                                                                                           
 row9                                            column=f:age, timestamp=1499150787898, value=age9                                                                                             
 row9                                            column=f:name, timestamp=1499150787898, value=name9                                                                                           
7 row(s) in 0.1520 seconds
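
The difference between ValueFilter (returns matching cells) and SingleColumnValueFilter (returns whole rows) can be sketched in plain Python. This is a model of the behaviour under stated assumptions, not HBase code: the function names are hypothetical, values are compared as raw bytes the way the `binary:` comparator does, and the `filter_if_missing` parameter mirrors the filter's default of keeping rows that lack the tested column.

```python
import operator

# A few cells copied from the sample table: (row, family, qualifier, value).
cells = [
    ("row2", "f", "age", "age2"),
    ("row2", "f", "name", "name2"),
    ("row3", "f", "age", "age3"),
    ("row3", "f", "name", "name3"),
    ("row4", "f", "age", "age4"),
    ("row4", "f", "name", "name4"),
]

def value_filter(cells, op, target):
    """Cell-level filter: keep each cell whose value satisfies op."""
    return [c for c in cells if op(c[3].encode(), target.encode())]

def single_column_value_filter(cells, family, qualifier, op, target,
                               filter_if_missing=False):
    """Row-level filter: test one column, then keep or drop the whole row."""
    rows = {}
    for c in cells:
        rows.setdefault(c[0], []).append(c)  # group cells by row key
    kept = []
    for row_cells in rows.values():
        tested = [c for c in row_cells if c[1] == family and c[2] == qualifier]
        if tested:
            passes = op(tested[0][3].encode(), target.encode())
        else:
            # Rows without the tested column pass unless filter_if_missing is set.
            passes = not filter_if_missing
        if passes:
            kept.extend(row_cells)
    return kept

only_cells = value_filter(cells, operator.eq, "name3")
whole_rows = single_column_value_filter(cells, "f", "name", operator.ge, "name3")
print(only_cells)
print(whole_rows)
```

In this model `only_cells` holds the single f:name cell of row3, while `whole_rows` holds both the age and name cells of row3 and row4, matching the shape of the shell output above.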

16. Using AND

   Equivalent to a FilterList with FilterList.Operator.MUST_PASS_ALL: a cell is returned only if every filter accepts it.

hbase(main):042:0> scan 'hbaseFilter',{STARTROW=>'row0',STOPROW=>'row99',FILTER=>"(FamilyFilter(=,'substring:f')) AND (ValueFilter(>,'binary:name6'))"}
ROW                                              COLUMN+CELL                                                                                                                                   
 row7                                            column=f:name, timestamp=1499150787893, value=name7                                                                                           
 row8                                            column=f:name, timestamp=1499150787896, value=name8                                                                                           
 row9                                            column=f:name, timestamp=1499150787898, value=name9                                                                                           
3 row(s) in 0.0100 seconds
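
The result may look surprising at first: `substring:f` also matches family f2, yet no f2 cell appears. A small plain-Python model shows why, assuming values are compared byte by byte (lexicographically), so that "name10" sorts before "name6". The helper names below are hypothetical and only illustrate the MUST_PASS_ALL idea.

```python
# A few cells copied from the sample table: (row, family, qualifier, value).
cells = [
    ("row6", "f", "name", "name6"),
    ("row7", "f", "age", "age7"),
    ("row7", "f", "name", "name7"),
    ("row10", "f2", "name", "name10"),
]

def family_contains(sub):
    """Model of FamilyFilter(=,'substring:...')."""
    return lambda c: sub in c[1]

def value_greater(target):
    """Model of ValueFilter(>,'binary:...'): lexicographic byte comparison."""
    return lambda c: c[3].encode() > target.encode()

def must_pass_all(*predicates):
    """Model of AND: a cell survives only if every predicate accepts it."""
    return lambda c: all(p(c) for p in predicates)

keep = must_pass_all(family_contains("f"), value_greater("name6"))
kept = [c for c in cells if keep(c)]
print(kept)
```

Only the name7 cell survives here: "age7" and "name10" both compare lower than "name6" as byte strings, and "name6" itself is not strictly greater.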

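17. Using OR

   Equivalent to a FilterList with FilterList.Operator.MUST_PASS_ONE: a cell is returned if at least one filter accepts it. AND binds more tightly than OR, so the expression below is evaluated as (A AND B) OR (C AND D).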

hbase(main):044:0> scan 'hbaseFilter',{STARTROW=>'row0',STOPROW=>'row99',FILTER=>"(FamilyFilter(=,'substring:f')) AND (ValueFilter(>,'binary:name6')) OR (FamilyFilter(=,'substring:f2')) AND (ValueFilter(>,'binary:name17'))"}
ROW                                              COLUMN+CELL                                                                                                                                   
 row18                                           column=f2:name, timestamp=1499150787927, value=name18                                                                                         
 row19                                           column=f2:name, timestamp=1499150787930, value=name19                                                                                         
 row7                                            column=f:name, timestamp=1499150787893, value=name7                                                                                           
 row8                                            column=f:name, timestamp=1499150787896, value=name8                                                                                           
 row9                                            column=f:name, timestamp=1499150787898, value=name9                                                                                           
5 row(s) in 0.1450 seconds
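
The grouping of the combined expression can be checked with a plain-Python model. It is a sketch rather than HBase code: the predicate helpers are hypothetical, values are compared as raw bytes, and Python's own `and`/`or` precedence happens to give the same grouping (AND before OR) that the filter string relies on.

```python
# A few cells copied from the sample table: (row, family, qualifier, value).
cells = [
    ("row17", "f2", "name", "name17"),
    ("row18", "f2", "name", "name18"),
    ("row6", "f", "name", "name6"),
    ("row7", "f", "name", "name7"),
    ("row9", "f", "age", "age9"),
]

def family_contains(cell, sub):
    """Model of FamilyFilter(=,'substring:...')."""
    return sub in cell[1]

def value_greater(cell, target):
    """Model of ValueFilter(>,'binary:...'): lexicographic byte comparison."""
    return cell[3].encode() > target.encode()

def keep(cell):
    # (family ~ 'f' AND value > 'name6') OR (family ~ 'f2' AND value > 'name17')
    return (family_contains(cell, "f") and value_greater(cell, "name6")
            or family_contains(cell, "f2") and value_greater(cell, "name17"))

kept = [c for c in cells if keep(c)]
print(kept)
```

In this model row18 is accepted only by the second clause ("name18" is lower than "name6" byte-wise but greater than "name17"), and row7 only by the first, which is consistent with the rows listed in the shell output.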

Recommended further reading: http://www.hadooptpoint.org/filters-in-hbase-shell/#codesyntax_3

