Sample data:
hbase(main):046:0> scan 'hbaseFilter'
ROW COLUMN+CELL
row0 column=f:age, timestamp=1499150787863, value=age0
row0 column=f:name, timestamp=1499150787863, value=name0
row1 column=f:age, timestamp=1499150787875, value=age1
row1 column=f:name, timestamp=1499150787875, value=name1
row10 column=f2:age, timestamp=1499150787901, value=age10
row10 column=f2:name, timestamp=1499150787901, value=name10
row11 column=f2:age, timestamp=1499150787905, value=age11
row11 column=f2:name, timestamp=1499150787905, value=name11
row12 column=f2:age, timestamp=1499150787908, value=age12
row12 column=f2:name, timestamp=1499150787908, value=name12
row13 column=f2:age, timestamp=1499150787911, value=age13
row13 column=f2:name, timestamp=1499150787911, value=name13
row14 column=f2:age, timestamp=1499150787913, value=age14
row14 column=f2:name, timestamp=1499150787913, value=name14
row15 column=f2:age, timestamp=1499150787917, value=age15
row15 column=f2:name, timestamp=1499150787917, value=name15
row16 column=f2:age, timestamp=1499150787920, value=age16
row16 column=f2:name, timestamp=1499150787920, value=name16
row17 column=f2:age, timestamp=1499150787923, value=age17
row17 column=f2:name, timestamp=1499150787923, value=name17
row18 column=f2:age, timestamp=1499150787927, value=age18
row18 column=f2:name, timestamp=1499150787927, value=name18
row19 column=f2:age, timestamp=1499150787930, value=age19
row19 column=f2:name, timestamp=1499150787930, value=name19
row2 column=f:age, timestamp=1499150787879, value=age2
row2 column=f:name, timestamp=1499150787879, value=name2
row3 column=f:age, timestamp=1499150787882, value=age3
row3 column=f:name, timestamp=1499150787882, value=name3
row4 column=f:age, timestamp=1499150787885, value=age4
row4 column=f:name, timestamp=1499150787885, value=name4
row5 column=f:age, timestamp=1499150787888, value=age5
row5 column=f:name, timestamp=1499150787888, value=name5
row6 column=f:age, timestamp=1499150787890, value=age6
row6 column=f:name, timestamp=1499150787890, value=name6
row7 column=f:age, timestamp=1499150787893, value=age7
row7 column=f:name, timestamp=1499150787893, value=name7
row8 column=f:age, timestamp=1499150787896, value=age8
row8 column=f:name, timestamp=1499150787896, value=name8
row9 column=f:age, timestamp=1499150787898, value=age9
row9 column=f:name, timestamp=1499150787898, value=name9
20 row(s) in 0.1990 seconds
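The row order in the scan above (row10 directly after row1, row2 only after row19) follows from HBase sorting row keys as byte strings rather than as numbers. The following is a minimal plain-Ruby sketch of that ordering, meant to be run outside HBase; the key list is reconstructed from the scan output and nothing here calls the HBase API.

```ruby
# Row keys as they appear in the scan output above (row0 .. row19).
row_keys = (0..19).map { |i| "row#{i}" }

# Ruby compares strings byte by byte, which models how HBase orders row keys.
sorted = row_keys.sort

puts sorted.join(' ')
```

Zero-padding the numeric part (row00, row01, ...) is a common way to make byte order agree with numeric order.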
Run the show_filters command in the HBase shell to list the filters that are available.
hbase(main):009:0> show_filters
ColumnPrefixFilter
TimestampsFilter
PageFilter
MultipleColumnPrefixFilter
FamilyFilter
ColumnPaginationFilter
SingleColumnValueFilter
RowFilter
QualifierFilter
ColumnRangeFilter
ValueFilter
PrefixFilter
SingleColumnValueExcludeFilter
ColumnCountGetFilter
InclusiveStopFilter
DependentColumnFilter
FirstKeyOnlyFilter
KeyOnlyFilter
1. KeyOnlyFilter
Returns only the key part of each cell; every value comes back empty.
import org.apache.hadoop.hbase.filter.KeyOnlyFilter;
scan 'hbaseFilter', {FILTER=>KeyOnlyFilter.new()}
ROW COLUMN+CELL
row0 column=f:age, timestamp=1499150787863, value=
row0 column=f:name, timestamp=1499150787863, value=
row1 column=f:age, timestamp=1499150787875, value=
row1 column=f:name, timestamp=1499150787875, value=
row10 column=f2:age, timestamp=1499150787901, value=
2. FirstKeyOnlyFilter
Returns only the first column of each row.
hbase(main):096:0> import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
hbase(main):097:0* scan 'hbaseFilter',{FILTER=>FirstKeyOnlyFilter.new()}
ROW COLUMN+CELL
row0 column=f:age, timestamp=1499150787863, value=age0
row1 column=f:age, timestamp=1499150787875, value=age1
row10 column=f2:age, timestamp=1499150787901, value=age10
row11 column=f2:age, timestamp=1499150787905, value=age11
row12 column=f2:age, timestamp=1499150787908, value=age12
3. PrefixFilter
Filters rows by row-key prefix.
hbase(main):106:0> import org.apache.hadoop.hbase.filter.PrefixFilter;
hbase(main):107:0* import org.apache.hadoop.hbase.util.Bytes;
hbase(main):108:0* scan 'hbaseFilter',{FILTER=>PrefixFilter.new(Bytes.toBytes('row3'))}
ROW COLUMN+CELL
row3 column=f:age, timestamp=1499150787882, value=age3
row3 column=f:name, timestamp=1499150787882, value=name3
1 row(s) in 0.1670 seconds
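The prefix 'row3' happens to match a single row here, but a prefix such as 'row1' would match row1 and row10 through row19 as well. A small plain-Ruby model of the prefix test (an illustration of the matching rule only, not HBase code; the lambda name is made up for this sketch):

```ruby
# Row keys of the sample table, in HBase (byte) order.
row_keys = (0..19).map { |i| "row#{i}" }.sort

# Hypothetical helper: keep the keys that start with the given prefix.
prefix_match = ->(keys, prefix) { keys.select { |k| k.start_with?(prefix) } }

puts prefix_match.call(row_keys, 'row3').inspect
puts prefix_match.call(row_keys, 'row1').length
```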
4. ColumnPrefixFilter
Returns only the columns whose names start with the given prefix ('n' in this example).
hbase(main):113:0> import org.apache.hadoop.hbase.filter.ColumnPrefixFilter;
hbase(main):114:0* import org.apache.hadoop.hbase.util.Bytes;
hbase(main):115:0* scan 'hbaseFilter',{STARTROW=>'row0',STOPROW=>'row99',FILTER=>ColumnPrefixFilter.new(Bytes.toBytes('n'))}
ROW COLUMN+CELL
row0 column=f:name, timestamp=1499150787863, value=name0
row1 column=f:name, timestamp=1499150787875, value=name1
row10 column=f2:name, timestamp=1499150787901, value=name10
row11 column=f2:name, timestamp=1499150787905, value=name11
row12 column=f2:name, timestamp=1499150787908, value=name12
row13 column=f2:name, timestamp=1499150787911, value=name13
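5. MultipleColumnPrefixFilter
Filters by column-name prefix and accepts several prefixes at once. The arguments are a list of prefixes, not a range: the example below returns the columns whose names start with 'a' or with 'b'.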
hbase(main):011:0> scan 'hbaseFilter',{STARTROW=>'row0',STOPROW=>'row99',FILTER=>"MultipleColumnPrefixFilter('a','b')"}
ROW COLUMN+CELL
row0 column=f:age, timestamp=1499150787863, value=age0
row1 column=f:age, timestamp=1499150787875, value=age1
row10 column=f2:age, timestamp=1499150787901, value=age10
row11 column=f2:age, timestamp=1499150787905, value=age11
row12 column=f2:age, timestamp=1499150787908, value=age12
6. ColumnCountGetFilter
Limits how many columns are returned per row (at most 2 in this example).
hbase(main):012:0> scan 'hbaseFilter',{STARTROW=>'row0',STOPROW=>'row99',FILTER=>"ColumnCountGetFilter(2)"}
ROW COLUMN+CELL
row0 column=f:age, timestamp=1499150787863, value=age0
row0 column=f:name, timestamp=1499150787863, value=name0
row1 column=f:age, timestamp=1499150787875, value=age1
row1 column=f:name, timestamp=1499150787875, value=name1
row10 column=f2:age, timestamp=1499150787901, value=age10
row10 column=f2:name, timestamp=1499150787901, value=name10
row11 column=f2:age, timestamp=1499150787905, value=age11
row11 column=f2:name, timestamp=1499150787905, value=name11
row12 column=f2:age, timestamp=1499150787908, value=age12
row12 column=f2:name, timestamp=1499150787908, value=name12
7. PageFilter
Limits how many rows are returned (3 in this example).
hbase(main):013:0> scan 'hbaseFilter',{STARTROW=>'row0',STOPROW=>'row99',FILTER=>"PageFilter(3)"}
ROW COLUMN+CELL
row0 column=f:age, timestamp=1499150787863, value=age0
row0 column=f:name, timestamp=1499150787863, value=name0
row1 column=f:age, timestamp=1499150787875, value=age1
row1 column=f:name, timestamp=1499150787875, value=name1
row10 column=f2:age, timestamp=1499150787901, value=age10
row10 column=f2:name, timestamp=1499150787901, value=name10
3 row(s) in 0.1680 seconds
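8. ColumnPaginationFilter
Pages through the columns of each row using a limit and an offset. With limit 2 and offset 1, the first column of each row (age) is skipped and only name is returned.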
hbase(main):014:0> scan 'hbaseFilter',{STARTROW=>'row0',STOPROW=>'row99',FILTER=>"ColumnPaginationFilter(2,1)"}
ROW COLUMN+CELL
row0 column=f:name, timestamp=1499150787863, value=name0
row1 column=f:name, timestamp=1499150787875, value=name1
row10 column=f2:name, timestamp=1499150787901, value=name10
row11 column=f2:name, timestamp=1499150787905, value=name11
row12 column=f2:name, timestamp=1499150787908, value=name12
row13 column=f2:name, timestamp=1499150787911, value=name13
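The limit/offset behaviour can be pictured as "skip offset columns, then keep up to limit columns" within each row. A minimal plain-Ruby model of that rule, assuming the two columns of the sample table in sorted order (the lambda is a hypothetical stand-in, not the HBase implementation):

```ruby
# Columns of one row in sorted order, as in the sample table.
columns = %w[age name]

# Model of ColumnPaginationFilter(limit, offset):
# skip `offset` columns, then keep at most `limit` of the rest.
paginate = ->(cols, limit, offset) { cols.drop(offset).take(limit) }

puts paginate.call(columns, 2, 1).inspect
```

With only two columns per row, an offset of 1 leaves a single column to return, which is why a limit of 2 still yields one cell per row.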
9. InclusiveStopFilter
Stops the scan at the given row and includes that row in the results.
hbase(main):015:0> scan 'hbaseFilter',{STARTROW=>'row0',STOPROW=>'row99',FILTER=>"InclusiveStopFilter('row15')"}
ROW COLUMN+CELL
row0 column=f:age, timestamp=1499150787863, value=age0
row0 column=f:name, timestamp=1499150787863, value=name0
row1 column=f:age, timestamp=1499150787875, value=age1
row1 column=f:name, timestamp=1499150787875, value=name1
row10 column=f2:age, timestamp=1499150787901, value=age10
row10 column=f2:name, timestamp=1499150787901, value=name10
row11 column=f2:age, timestamp=1499150787905, value=age11
row11 column=f2:name, timestamp=1499150787905, value=name11
row12 column=f2:age, timestamp=1499150787908, value=age12
row12 column=f2:name, timestamp=1499150787908, value=name12
row13 column=f2:age, timestamp=1499150787911, value=age13
row13 column=f2:name, timestamp=1499150787911, value=name13
row14 column=f2:age, timestamp=1499150787913, value=age14
row14 column=f2:name, timestamp=1499150787913, value=name14
row15 column=f2:age, timestamp=1499150787917, value=age15
row15 column=f2:name, timestamp=1499150787917, value=name15
8 row(s) in 0.2250 seconds
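10. TimestampsFilter
Returns only the cells whose timestamp exactly equals one of the listed timestamps. The arguments are individual timestamps, not a range.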
hbase(main):016:0> scan 'hbaseFilter',{STARTROW=>'row0',STOPROW=>'row99',FILTER=>"TimestampsFilter(1499150787875,1499150787913)"}
ROW COLUMN+CELL
row1 column=f:age, timestamp=1499150787875, value=age1
row1 column=f:name, timestamp=1499150787875, value=name1
row14 column=f2:age, timestamp=1499150787913, value=age14
row14 column=f2:name, timestamp=1499150787913, value=name14
2 row(s) in 0.0340 seconds
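Only row1 and row14 come back even though many other cells have timestamps between the two arguments, because the filter tests membership in a list rather than a range. A plain-Ruby sketch of that test, using three timestamps copied from the sample data (the variable names are illustrative):

```ruby
# A few rows and their cell timestamps, taken from the scan output above.
cells = {
  'row1'  => 1499150787875,
  'row2'  => 1499150787879,
  'row14' => 1499150787913
}

# The two timestamps passed to TimestampsFilter in the example.
wanted = [1499150787875, 1499150787913]

# Keep a row only if its timestamp is exactly one of the wanted values.
matched = cells.select { |_row, ts| wanted.include?(ts) }.keys

puts matched.inspect
```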
11. RowFilter
Filters rows by comparing the row key with the given value.
hbase(main):018:0> scan 'hbaseFilter',{STARTROW=>'row0',STOPROW=>'row99',FILTER=>"RowFilter(>=,'binary:row6')"}
ROW COLUMN+CELL
row6 column=f:age, timestamp=1499150787890, value=age6
row6 column=f:name, timestamp=1499150787890, value=name6
row7 column=f:age, timestamp=1499150787893, value=age7
row7 column=f:name, timestamp=1499150787893, value=name7
row8 column=f:age, timestamp=1499150787896, value=age8
row8 column=f:name, timestamp=1499150787896, value=name8
row9 column=f:age, timestamp=1499150787898, value=age9
row9 column=f:name, timestamp=1499150787898, value=name9
4 row(s) in 0.2340 seconds
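The binary comparator compares bytes, so row10 through row19 are not "greater than or equal to" row6 and do not appear in the result. A minimal plain-Ruby model of the comparison, using Ruby's byte-wise string comparison as a stand-in for the binary comparator:

```ruby
# Row keys of the sample table, in HBase (byte) order.
row_keys = (0..19).map { |i| "row#{i}" }.sort

# Keep the keys that compare greater than or equal to 'row6' byte by byte.
matched = row_keys.select { |k| (k <=> 'row6') >= 0 }

puts matched.inspect
```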
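12. FamilyFilter
Filters by column family. The substring comparator matches every family whose name contains 'f', so both f and f2 are returned here.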
hbase(main):020:0> scan 'hbaseFilter',{STARTROW=>'row0',STOPROW=>'row99',FILTER=>"FamilyFilter(=,'substring:f')"}
ROW COLUMN+CELL
row0 column=f:age, timestamp=1499150787863, value=age0
row0 column=f:name, timestamp=1499150787863, value=name0
row1 column=f:age, timestamp=1499150787875, value=age1
row1 column=f:name, timestamp=1499150787875, value=name1
row10 column=f2:age, timestamp=1499150787901, value=age10
row10 column=f2:name, timestamp=1499150787901, value=name10
row11 column=f2:age, timestamp=1499150787905, value=age11
row11 column=f2:name, timestamp=1499150787905, value=name11
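The difference between a substring test and an exact test on the family name can be sketched in plain Ruby as follows. This only models the two matching rules on the sample table's family names; the expectation that an exact comparison such as 'binary:f' would select only f is an assumption based on how byte equality works, not output taken from the original post.

```ruby
# Column families of the sample table.
families = %w[f f2]

# Substring-style match: any family whose name contains 'f'.
substring_match = families.select { |fam| fam.include?('f') }

# Exact-equality match: only the family named exactly 'f'.
exact_match = families.select { |fam| fam == 'f' }

puts substring_match.inspect
puts exact_match.inspect
```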
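13. QualifierFilter
Filters by column name (qualifier). The example uses a regular expression that matches names containing 'n' followed by any character.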
hbase(main):023:0> scan 'hbaseFilter',{STARTROW=>'row0',STOPROW=>'row99',FILTER=>"QualifierFilter(=,'regexstring:n.')"}
ROW COLUMN+CELL
row0 column=f:name, timestamp=1499150787863, value=name0
row1 column=f:name, timestamp=1499150787875, value=name1
row10 column=f2:name, timestamp=1499150787901, value=name10
row11 column=f2:name, timestamp=1499150787905, value=name11
row12 column=f2:name, timestamp=1499150787908, value=name12
row13 column=f2:name, timestamp=1499150787911, value=name13
14. ValueFilter
Filters by cell value; only the matching cells are returned, not the rest of the row.
hbase(main):024:0> scan 'hbaseFilter',{STARTROW=>'row0',STOPROW=>'row99',FILTER=>"ValueFilter(=,'binary:name3')"}
ROW COLUMN+CELL
row3 column=f:name, timestamp=1499150787882, value=name3
1 row(s) in 0.0140 seconds
15. SingleColumnValueFilter
Tests the value of one column and returns the whole row when that column matches.
hbase(main):035:0> scan 'hbaseFilter',{STARTROW=>'row0',STOPROW=>'row99',COLUMN=>'f',FILTER=>"SingleColumnValueFilter('f','name',>=,'binary:name3')"}
ROW COLUMN+CELL
row3 column=f:age, timestamp=1499150787882, value=age3
row3 column=f:name, timestamp=1499150787882, value=name3
row4 column=f:age, timestamp=1499150787885, value=age4
row4 column=f:name, timestamp=1499150787885, value=name4
row5 column=f:age, timestamp=1499150787888, value=age5
row5 column=f:name, timestamp=1499150787888, value=name5
row6 column=f:age, timestamp=1499150787890, value=age6
row6 column=f:name, timestamp=1499150787890, value=name6
row7 column=f:age, timestamp=1499150787893, value=age7
row7 column=f:name, timestamp=1499150787893, value=name7
row8 column=f:age, timestamp=1499150787896, value=age8
row8 column=f:name, timestamp=1499150787896, value=name8
row9 column=f:age, timestamp=1499150787898, value=age9
row9 column=f:name, timestamp=1499150787898, value=name9
7 row(s) in 0.1520 seconds
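The contrast between ValueFilter (matching cells only) and SingleColumnValueFilter (whole rows) can be modelled in plain Ruby. The sketch below rebuilds the ten rows of family f from the sample data and applies both rules; it illustrates the semantics and is not HBase client code.

```ruby
# Rows row0..row9 of family f: row key => { qualifier => value }.
rows = (0..9).map { |i| ["row#{i}", { 'age' => "age#{i}", 'name' => "name#{i}" }] }.to_h

# ValueFilter-like rule: keep only the individual cells whose value equals 'name3'.
cells = rows.flat_map do |key, cols|
  cols.select { |_q, v| v == 'name3' }.map { |q, v| [key, q, v] }
end

# SingleColumnValueFilter-like rule: test the 'name' column,
# then keep every cell of each row that passes.
whole_rows = rows.select { |_key, cols| (cols['name'] <=> 'name3') >= 0 }

puts cells.inspect
puts whole_rows.keys.inspect
```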
16. Using AND
Equivalent to a FilterList with FilterList.Operator.MUST_PASS_ALL.
hbase(main):042:0> scan 'hbaseFilter',{STARTROW=>'row0',STOPROW=>'row99',FILTER=>"(FamilyFilter(=,'substring:f')) AND (ValueFilter(>,'binary:name6'))"}
ROW COLUMN+CELL
row7 column=f:name, timestamp=1499150787893, value=name7
row8 column=f:name, timestamp=1499150787896, value=name8
row9 column=f:name, timestamp=1499150787898, value=name9
3 row(s) in 0.0100 seconds
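17. Using OR
Equivalent to a FilterList with FilterList.Operator.MUST_PASS_ONE. AND is evaluated before OR, so the expression below reads as (A AND B) OR (C AND D).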
hbase(main):044:0> scan 'hbaseFilter',{STARTROW=>'row0',STOPROW=>'row99',FILTER=>"(FamilyFilter(=,'substring:f')) AND (ValueFilter(>,'binary:name6')) OR (FamilyFilter(=,'substring:f2')) AND (ValueFilter(>,'binary:name17'))"}
ROW COLUMN+CELL
row18 column=f2:name, timestamp=1499150787927, value=name18
row19 column=f2:name, timestamp=1499150787930, value=name19
row7 column=f:name, timestamp=1499150787893, value=name7
row8 column=f:name, timestamp=1499150787896, value=name8
row9 column=f:name, timestamp=1499150787898, value=name9
5 row(s) in 0.1450 seconds
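To check why exactly these five rows come back, the combined condition can be replayed in plain Ruby over a reconstruction of the sample table. Ruby's && also binds tighter than ||, so the expression has the same shape as the filter string; string comparison with > stands in for the binary comparator and include? for the substring comparator. This is a model of the logic, not HBase code.

```ruby
# All cells of the sample table as [row, family, value].
cells = (0..19).flat_map do |i|
  fam = i < 10 ? 'f' : 'f2'
  [["row#{i}", fam, "age#{i}"], ["row#{i}", fam, "name#{i}"]]
end

# (family contains 'f' AND value > 'name6') OR (family contains 'f2' AND value > 'name17')
matched = cells.select do |_row, fam, val|
  fam.include?('f') && val > 'name6' || fam.include?('f2') && val > 'name17'
end

# Sort by row key to mirror the order of the scan output.
puts matched.map(&:first).sort.inspect
```

Note that name10 through name19 compare lower than 'name6' byte by byte, which is why the first clause alone does not pick up any f2 rows.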
A recommended reference: http://www.hadooptpoint.org/filters-in-hbase-shell/#codesyntax_3