Before starting, install Hadoop 2.2. A recommended article:
http://blog.youkuaiyun.com/u010670689/article/details/30495989
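Download hbase-0.96.0-hadoop2-bin.tar.gz from the Apache website and extract it under /cloud.
Change into the extracted directory, for example /cloud/hbase-0.96.1.1-hadoop2 (the directory name depends on the version you downloaded; the steps below use /home/hadoop/hbase-0.96.0-hadoop2, so adjust the paths to your own install location).
1. cd to /home/hadoop/hbase-0.96.0-hadoop2/conf and run vi hbase-env.sh to set the correct JAVA_HOME:
export JAVA_HOME=/usr/java/jdk1.7.0
Let HBase manage its own ZooKeeper. With true, HBase starts and stops ZooKeeper itself, which is why HQuorumPeer shows up in jps in step 5; use false only if you run a separate ZooKeeper ensemble:
export HBASE_MANAGES_ZK=true
2. Configure hbase-site.xml: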
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://master:9000/hbase</value> <!-- with NameNode HA, point this at the active NameNode -->
</property>
<property>
<name>hbase.master</name>
<value>master:60000</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hadoop/data/zookeeper</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>master,slave1,slave2</value>
</property>
</configuration>
3. Run vi regionservers to configure the region servers, the nodes that actually serve the data. Here master, slave1 and slave2 are all region servers, one node per line:
master
slave1
slave2
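4. The basic HBase configuration is almost done, but one important step remains. The lib directory shows that this HBase build is based on Hadoop 2.1.0, so its Hadoop jars should be replaced with the ones from our Hadoop 2.2.0 to keep the versions consistent. The Hadoop jars all live under $HADOOP_HOME/share/hadoop. First cd to /home/hadoop/hbase-0.96.0-hadoop2/lib and delete all Hadoop jars:
rm -rf hadoop*.jar
Then copy all Hadoop 2.2.0 jars into the HBase lib directory so that both use the same Hadoop version:
find /home/hadoop/hadoop2.2.0/share/hadoop -name "hadoop*jar" | xargs -i cp {} /home/hadoop/hbase-0.96.0-hadoop2/lib/
(My HBase build had no version problem, so I did not need this step.)
5. Now HBase can be started. cd to $HBASE_HOME/bin and run ./start-hbase.sh. If startup succeeds, jps shows HMaster running on the master, and HRegionServer and HQuorumPeer on slave1 and the other region servers. You can also run ./hbase shell in the same bin directory to enter the HBase client and run list or create 'test','cf1' to check that the cluster works. The web UI is at http://master:60010/master-status.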
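Basic HBase usage:
It is worth reading this post on the overall HBase architecture first: http://blog.youkuaiyun.com/u010670689/article/details/33384449
The components of an HBase table mean the following:
(1) Row Key
As in other NoSQL databases, the row key is the primary key used to retrieve records. Rows in an HBase table can be accessed in only three ways:
(1.1) by a single row key
(1.2) by a range of row keys
(1.3) by a full table scan
A row key can be any string (maximum length 64KB; in practice usually 10-100 bytes). Internally, HBase stores row keys as byte arrays.
Data is stored sorted by row key in lexicographic (byte) order. When designing keys, take full advantage of this sorted storage and place rows that are often read together next to each other (locality).
Note:
Lexicographic ordering of integers gives 1,10,100,11,12,13,14,15,16,17,18,19,2,20,21,…,9,91,92,93,94,95,96,97,98,99. To keep the natural integer order, row keys must be left-padded with 0.
A read or write of a single row is atomic, no matter how many columns it touches. This design decision makes it easy to reason about a program's behavior under concurrent updates to the same row.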
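The zero-padding rule in the note above can be checked with a small Python sketch. It is not HBase code; it only sorts byte strings the way HBase orders row keys. The helper name padded_key and the width of 3 are assumptions chosen for illustration.

```python
# Sketch: byte-wise (lexicographic) ordering of row keys, with and without zero padding.
# padded_key and its default width are illustrative assumptions, not part of any HBase API.

def padded_key(n: int, width: int = 3) -> bytes:
    """Encode an integer as a fixed-width, zero-padded ASCII row key."""
    return str(n).zfill(width).encode("ascii")

ids = [1, 2, 9, 10, 11, 100]

# Python compares bytes objects byte by byte, which matches lexicographic order.
raw_keys = sorted(str(n).encode("ascii") for n in ids)
padded_keys = sorted(padded_key(n) for n in ids)

print(raw_keys)     # unpadded keys: 10 and 100 sort before 2
print(padded_keys)  # padded keys follow the natural integer order
```

The width has to be chosen for the largest value you expect, because a key that outgrows the width breaks the ordering again.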
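(2) Column family
Every column in an HBase table belongs to a column family. Column families are part of the table schema (columns are not) and must be defined before the table is used. Column names are prefixed with the column family; for example, courses:history and courses:math both belong to the courses family.
Access control and disk and memory usage accounting are done at the column family level. In practice, per-family permissions help manage different kinds of applications: some applications may add new base data, some may read base data and create derived column families, and some may only browse data (and possibly not all of it, for privacy reasons).
(3) Cell
The storage unit determined by a row and a column is called a cell. A cell is uniquely identified by {row key, column (= <family> + <label>), version}. Cell data is untyped and stored as raw bytes.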
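(4) Timestamp
Each cell holds multiple versions of the same data, indexed by timestamp. The timestamp is a 64-bit integer. It can be assigned by HBase automatically at write time, in which case it is the current system time in milliseconds, or it can be set explicitly by the client. An application that needs to avoid version conflicts must generate unique timestamps itself. Within a cell, versions are sorted by timestamp in descending order, so the newest data comes first.
To avoid the management burden (storage and indexing) of too many versions, HBase offers two ways to reclaim old versions: keep the last n versions, or keep only the versions from a recent period (for example, the last seven days). Both can be set per column family.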
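The two retention rules can be illustrated with a toy Python model. This is a sketch under stated assumptions, not how HBase implements it: prune_versions, its parameters, and the sample timestamps are all hypothetical names and values made up for the example.

```python
# Toy model of per-cell version retention; not HBase code.
from typing import List, Optional, Tuple

Version = Tuple[int, bytes]  # (timestamp in milliseconds, value)

def prune_versions(versions: List[Version],
                   max_versions: Optional[int] = None,
                   ttl_ms: Optional[int] = None,
                   now_ms: Optional[int] = None) -> List[Version]:
    """Return the versions that survive, newest first."""
    # Versions of a cell are ordered by timestamp, newest first.
    ordered = sorted(versions, key=lambda v: v[0], reverse=True)
    # Rule 2: drop versions older than the time-to-live window.
    if ttl_ms is not None and now_ms is not None:
        ordered = [v for v in ordered if now_ms - v[0] <= ttl_ms]
    # Rule 1: keep only the newest max_versions versions.
    if max_versions is not None:
        ordered = ordered[:max_versions]
    return ordered

# Hypothetical timestamps and values for one cell.
cell = [(1000, b"a"), (3000, b"c"), (2000, b"b"), (4000, b"d")]

last_two = prune_versions(cell, max_versions=2)
recent = prune_versions(cell, ttl_ms=2500, now_ms=4000)
```

In the shell, the corresponding column family settings are VERSIONS and TTL, both visible in the describe output later in this post.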
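Basic usage of the HBase shell
HBase provides a shell for interactive use. Run hbase shell to enter it, and run help to see the command help.
The examples below use a student grades table found online.
name | grade | course:math | course:art
Tom | 5 | 97 | 87
Jim | 4 | 89 | 80
Here grade is a column family with a single column of its own, and course is a column family with two columns, math and art. More columns, such as computer or physics, can be added to the course family as needed.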
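One way to picture this table is as nested maps: row key, then column family, then column qualifier, then value. The Python sketch below is only a mental model under that assumption; get_cell and scan are hypothetical helpers, versions are left out, and values are plain strings rather than the bytes HBase stores.

```python
# Sketch of the logical layout: row key -> column family -> qualifier -> value.
# An empty qualifier ("") stands for a family used without a column name.
table = {
    "Tom": {"grade": {"": "5"}, "course": {"math": "97", "art": "87"}},
    "Jim": {"grade": {"": "4"}, "course": {"math": "89", "art": "80"}},
}

def get_cell(table, row, column):
    """Look up 'family:qualifier' (or just 'family') for one row; None if absent."""
    family, _, qualifier = column.partition(":")
    return table.get(row, {}).get(family, {}).get(qualifier)

def scan(table):
    """Yield (row, column, value) in row-key order, with columns sorted within each row."""
    for row in sorted(table):
        for family in sorted(table[row]):
            for qualifier in sorted(table[row][family]):
                yield row, f"{family}:{qualifier}", table[row][family][qualifier]

print(get_cell(table, "Tom", "course:math"))
print(get_cell(table, "tom", "course:math"))  # row keys are case-sensitive
```

Adding a new course for one student is just a new qualifier under course for that row; no schema change is involved, which is the flexibility described above.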
1. Create a table (the Tab key auto-completes)
hbase(main):018:0> create 'courses','name','grade','course' # only column families are given at creation time; columns are not declared
0 row(s) in 0.4320 seconds
=> Hbase::Table - courses
2. List all tables (use the describe command to see a table's schema)
hbase(main):019:0> list
TABLE
courses
1 row(s) in 0.0350 seconds
=> ["courses"]
3. Delete a table
hbase(main):020:0> disable 'courses' # disable the table first
0 row(s) in 1.4860 seconds
hbase(main):021:0> drop 'courses' # then drop it
0 row(s) in 0.1750 seconds
4. Insert data
hbase(main):007:0> put 'courses','tom','grade:','5'
0 row(s) in 0.0900 seconds
5. Query data
hbase(main):008:0> scan 'courses'
ROW COLUMN+CELL
tom column=grade:, timestamp=1403489817691, value=5
1 row(s) in 0.1260 seconds
6. Insert more data
hbase(main):009:0> put 'courses','tom','course:math','97'
0 row(s) in 0.0350 seconds
hbase(main):010:0> put 'courses','tom','course:art','87'
0 row(s) in 0.0070 seconds
hbase(main):011:0> put 'courses','jarry','grade','4'
0 row(s) in 0.0100 seconds
hbase(main):014:0> put 'courses','jarry','course:','98'
0 row(s) in 0.0080 seconds
hbase(main):015:0> put 'courses','jarry','course:','89'
0 row(s) in 0.0070 seconds
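The table now has its structure. It is quite flexible: columns can be added freely inside a column family. If a column family is used without a column name, the trailing colon is optional.
The put command is simple and has only one form:
hbase> put 't1', 'r1', 'c1', 'value', ts1
t1 is the table name, r1 the row key, c1 the column name, and value the cell value. ts1 is the timestamp and is usually omitted.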
7. Query data by row key
hbase(main):016:0> get 'courses','tom'
COLUMN CELL
course:art timestamp=1403489905110, value=87
course:math timestamp=1403489894414, value=97
grade: timestamp=1403489817691, value=5
3 row(s) in 0.0380 seconds
hbase(main):017:0> get 'courses','Tom' # row keys are case-sensitive
COLUMN CELL
0 row(s) in 0.0100 seconds
hbase(main):019:0> get 'courses','tom','course' # retrieve one column family
COLUMN CELL
course:art timestamp=1403489905110, value=87
course:math timestamp=1403489894414, value=97
2 row(s) in 0.0070 seconds
hbase(main):020:0> get 'courses','tom','course:math' # retrieve one column
COLUMN CELL
course:math timestamp=1403489894414, value=97
1 row(s) in 0.0110 seconds
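You may have noticed the pattern: an HBase shell command is the operation keyword followed by the table name, row key, and column name, in that order. Any further conditions go in curly braces.
Insert a few more values: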
hbase(main):021:0> put 'courses','tom','course:math','100'
0 row(s) in 0.0180 seconds
hbase(main):022:0> put 'courses','tom','course:math','67'
0 row(s) in 0.0070 seconds
hbase(main):023:0> put 'courses','tom','course:math','89'
0 row(s) in 0.0080 seconds
Queries:
hbase(main):026:0> get 'courses' # wrong: get needs at least a row key
hbase(main):027:0> get 'courses','tom',{TIMERANGE=>[1403489905110,1403490514442]} # time range; the lower bound is inclusive, the upper bound exclusive
COLUMN CELL
course:art timestamp=1403489905110, value=87
course:math timestamp=1403490509676, value=67
2 row(s) in 0.0240 seconds
hbase(main):028:0> get 'courses','tom',{TIMERANGE=>[1403489905110,1403490514442],COLUMN=>'course'} # time range plus column
COLUMN CELL
course:art timestamp=1403489905110, value=87
course:math timestamp=1403490509676, value=67
2 row(s) in 0.0120 seconds
hbase(main):030:0> get 'courses','tom',{COLUMN=>['course','grade']} # several columns
COLUMN CELL
course:art timestamp=1403489905110, value=87
course:math timestamp=1403490514442, value=89
grade: timestamp=1403489817691, value=5
3 row(s) in 0.0120 seconds
hbase(main):068:0> get 'courses','tom',{COLUMN=>'course',TIMERANGE=>[1103489905110,1903490514442],VERSIONS=>10} # time range plus number of versions
COLUMN CELL
course:art timestamp=1403489905110, value=87
course:math timestamp=1403491556319, value=100
course:math timestamp=1403490514442, value=89
3 row(s) in 0.0140 seconds
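The TIMERANGE results above are easier to follow once you know that the range is half-open: the lower bound is included and the upper bound is excluded. The Python sketch below models that selection; get_in_timerange is a hypothetical helper, and the data is a subset of the timestamps shown in the sessions above.

```python
# Toy model of get with TIMERANGE and VERSIONS; not HBase code.
from typing import Dict, List, Tuple

Cells = Dict[str, List[Tuple[int, str]]]  # column -> [(timestamp, value), ...]

def get_in_timerange(cells: Cells, min_ts: int, max_ts: int, versions: int = 1) -> Cells:
    """Keep versions with min_ts <= timestamp < max_ts, newest first, at most `versions` per column."""
    result: Cells = {}
    for column, history in cells.items():
        matching = sorted((v for v in history if min_ts <= v[0] < max_ts), reverse=True)
        if matching:
            result[column] = matching[:versions]
    return result

# Timestamps and values for row 'tom', taken from the session output.
row_tom: Cells = {
    "course:art": [(1403489905110, "87")],
    "course:math": [(1403489894414, "97"), (1403490509676, "67"), (1403490514442, "89")],
}

selected = get_in_timerange(row_tom, 1403489905110, 1403490514442)
print(selected)
```

This matches the first TIMERANGE query: course:art at the lower bound is returned, while the course:math value 89, written exactly at the upper bound, is not, so 67 is the newest version inside the range.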
If the number of versions kept by the column family is not what you need, course:math may not retain enough versions. It can be changed:
hbase(main):059:0> describe 'courses'
DESCRIPTION ENABLED
'courses', {NAME => 'course', DATA_BLOCK_ENCODING = true
> 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE =
> '0', VERSIONS => '5', COMPRESSION => 'NONE', MIN_
VERSIONS => '0', TTL => '2147483647', KEEP_DELETED_
CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY =
> 'false', BLOCKCACHE => 'true'}, {NAME => 'grade',
DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW
', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPR
ESSION => 'NONE', MIN_VERSIONS => '0', TTL => '2147
483647', KEEP_DELETED_CELLS => 'false', BLOCKSIZE =
> '65536', IN_MEMORY => 'false', BLOCKCACHE => 'tru
e'}, {NAME => 'name', DATA_BLOCK_ENCODING => 'NONE'
, BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', V
ERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS
=> '0', TTL => '2147483647', KEEP_DELETED_CELLS =>
'false', BLOCKSIZE => '65536', IN_MEMORY => 'false
', BLOCKCACHE => 'true'}
1 row(s) in 0.1090 seconds
hbase(main):060:0> disable 'courses'
0 row(s) in 1.4540 seconds
hbase(main):061:0> alter 'courses',NAME=>'course',VERSIONS=>4
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 1.1730 seconds
hbase(main):062:0> enable 'courses'
0 row(s) in 1.4250 seconds
hbase(main):065:0> describe 'courses'
DESCRIPTION ENABLED
'courses', {NAME => 'course', DATA_BLOCK_ENCODING = true
> 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE =
> '0', VERSIONS => '4', COMPRESSION => 'NONE', MIN_
VERSIONS => '0', TTL => '2147483647', KEEP_DELETED_
CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY =
> 'false', BLOCKCACHE => 'true'}, {NAME => 'grade',
DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW
', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPR
ESSION => 'NONE', MIN_VERSIONS => '0', TTL => '2147
483647', KEEP_DELETED_CELLS => 'false', BLOCKSIZE =
> '65536', IN_MEMORY => 'false', BLOCKCACHE => 'tru
e'}, {NAME => 'name', DATA_BLOCK_ENCODING => 'NONE'
, BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', V
ERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS
=> '0', TTL => '2147483647', KEEP_DELETED_CELLS =>
'false', BLOCKSIZE => '65536', IN_MEMORY => 'false
', BLOCKCACHE => 'true'}
1 row(s) in 0.0550 seconds
hbase(main):069:0> get 'courses','tom','course:math','course:art' # query several columns
COLUMN CELL
course:art timestamp=1403489905110, value=87
course:math timestamp=1403491556319, value=100
2 row(s) in 0.0250 seconds
hbase(main):070:0> get 'courses','tom',['course:math','course:art'] # query several columns
COLUMN CELL
course:art timestamp=1403489905110, value=87
course:math timestamp=1403491556319, value=100
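2 row(s) in 0.0230 seconds
8. Scan all data
scan 'courses'
Modifiers can also be given: TIMERANGE, FILTER, LIMIT, STARTROW, STOPROW, TIMESTAMP, MAXLENGTH, or COLUMNS. With no modifier, as in the line above, all rows are shown.
hbase(main):071:0> scan 'courses' # full table scan; on a large production table this is extremely expensive and is generally avoided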
ROW COLUMN+CELL
jarry column=course:, timestamp=1403490041032, value=89
jarry column=grade:, timestamp=1403489972431, value=4
tom column=course:art, timestamp=1403489905110, value=87
tom column=course:math, timestamp=1403491556319, value=100
tom column=grade:, timestamp=1403489817691, value=5
2 row(s) in 0.0490 seconds
hbase(main):072:0> scan 'courses',{COLUMNS=>'course'} # one column family
ROW COLUMN+CELL
jarry column=course:, timestamp=1403490041032, value=89
tom column=course:art, timestamp=1403489905110, value=87
tom column=course:math, timestamp=1403491556319, value=100
2 row(s) in 0.0130 seconds
hbase(main):073:0> scan 'courses',{COLUMNS=>'course:math'} # one column
ROW COLUMN+CELL
tom column=course:math, timestamp=1403491556319, value=100
1 row(s) in 0.0090 seconds
hbase(main):009:0> scan 'courses',{COLUMNS=>['course','grade']} # several column families
ROW COLUMN+CELL
jarry column=course:, timestamp=1403490041032, value=89
jarry column=grade:, timestamp=1403489972431, value=4
tom column=course:art, timestamp=1403489905110, value=87
tom column=course:math, timestamp=1403491556319, value=100
tom column=grade:, timestamp=1403489817691, value=5
2 row(s) in 0.0150 seconds
hbase(main):010:0> scan 'courses',{COLUMNS=>['course','grade'],LIMIT=>2} # columns plus row limit
ROW COLUMN+CELL
jarry column=course:, timestamp=1403490041032, value=89
jarry column=grade:, timestamp=1403489972431, value=4
tom column=course:art, timestamp=1403489905110, value=87
tom column=course:math, timestamp=1403491556319, value=100
tom column=grade:, timestamp=1403489817691, value=5
2 row(s) in 0.0150 seconds
hbase(main):011:0> scan 'courses',{COLUMNS=>['course','grade'],LIMIT=>1} # columns plus row limit
ROW COLUMN+CELL
jarry column=course:, timestamp=1403490041032, value=89
jarry column=grade:, timestamp=1403489972431, value=4
1 row(s) in 0.0200 seconds
hbase(main):013:0> scan 'courses',{COLUMNS=>['course','grade'],LIMIT=>2,TIMERANGE=>[1203490041032,1403490041032]} # columns plus row limit plus time range
ROW COLUMN+CELL
jarry column=grade:, timestamp=1403489972431, value=4
tom column=course:art, timestamp=1403489905110, value=87
tom column=grade:, timestamp=1403489817691, value=5
2 row(s) in 0.0150 seconds
Filters:
Add some data first:
hbase(main):054:0> put 'courses','tomcat','grade',12
0 row(s) in 0.0740 seconds
hbase(main):055:0> put 'courses','tomcat','course:math',67
0 row(s) in 0.1650 seconds
hbase(main):056:0> put 'courses','tomcat','course:art',56
0 row(s) in 0.0150 seconds
Query:
hbase(main):058:0> scan 'courses',{FILTER=>"PrefixFilter('to')"} # prefix filter; it filters on the row key
ROW COLUMN+CELL
tom column=course:art, timestamp=1403489905110, value=87
tom column=course:math, timestamp=1403491556319, value=100
tom column=grade:, timestamp=1403489817691, value=5
tomcat column=course:art, timestamp=1403505246682, value=56
tomcat column=course:math, timestamp=1403505237535, value=67
tomcat column=grade:, timestamp=1403505215820, value=12
2 row(s) in 0.0300 seconds
hbase(main):061:0> scan 'courses',{FILTER=>"PrefixFilter('to') AND QualifierFilter(>=,'binary:b')"} # prefix filter plus qualifier filter; the qualifier filter compares the column name, not the cell value
ROW COLUMN+CELL
tom column=course:math, timestamp=1403491556319, value=100
tom column=grade:, timestamp=1403489817691, value=5
tomcat column=course:math, timestamp=1403505237535, value=67
tomcat column=grade:, timestamp=1403505215820, value=12
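2 row(s) in 0.0220 seconds
hbase(main):064:0> scan 'courses',{FILTER=>"(PrefixFilter('to') AND (QualifierFilter(>=,'binary:b'))) AND (TimestampsFilter(1403491556319,1403505237535))"} # row key prefix + column name + timestamps; TimestampsFilter matches the listed timestamps exactly, it is not a range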
ROW COLUMN+CELL
tom column=course:math, timestamp=1403491556319, value=100
tomcat column=course:math, timestamp=1403505237535, value=67
2 row(s) in 0.0180 seconds
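The difference between a timestamp range and TimestampsFilter is easy to miss, so here is a toy Python model of the last query. It is a sketch, not HBase code: prefix_and_timestamps is a hypothetical helper, the QualifierFilter part is left out for brevity, and the cells are copied from the scan output above.

```python
# Toy model of PrefixFilter AND TimestampsFilter over (row, column, timestamp, value) cells.
from typing import Iterable, List, Set, Tuple

Cell = Tuple[str, str, int, str]

def prefix_and_timestamps(cells: Iterable[Cell], prefix: str, timestamps: Set[int]) -> List[Cell]:
    """Keep cells whose row key starts with prefix and whose timestamp is exactly in the set."""
    return [c for c in cells if c[0].startswith(prefix) and c[2] in timestamps]

cells = [
    ("jarry", "course:", 1403490041032, "89"),
    ("jarry", "grade:", 1403489972431, "4"),
    ("tom", "course:art", 1403489905110, "87"),
    ("tom", "course:math", 1403491556319, "100"),
    ("tom", "grade:", 1403489817691, "5"),
    ("tomcat", "course:art", 1403505246682, "56"),
    ("tomcat", "course:math", 1403505237535, "67"),
    ("tomcat", "grade:", 1403505215820, "12"),
]

matched = prefix_and_timestamps(cells, "to", {1403491556319, 1403505237535})
for cell in matched:
    print(cell)
```

The tomcat grade cell has a timestamp between the two listed values yet is not returned, in the model and in the real session alike, which shows that the two numbers are treated as exact timestamps.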
scan 't1', {FILTER => org.apache.hadoop.hbase.filter.ColumnPaginationFilter.new(1, 0)}
hbase(main):065:0> scan 'courses',{FILTER => org.apache.hadoop.hbase.filter.ColumnPaginationFilter.new(1, 0)}
ROW COLUMN+CELL
jarry column=course:, timestamp=1403490041032, value=89
tom column=course:art, timestamp=1403489905110, value=87
tomcat column=course:art, timestamp=1403505246682, value=56
3 row(s) in 0.0250 seconds
hbase(main):066:0> scan 'courses',{FILTER => org.apache.hadoop.hbase.filter.ColumnPaginationFilter.new(2, 0)}
ROW COLUMN+CELL
jarry column=course:, timestamp=1403490041032, value=89
jarry column=grade:, timestamp=1403489972431, value=4
tom column=course:art, timestamp=1403489905110, value=87
tom column=course:math, timestamp=1403491556319, value=100
tomcat column=course:art, timestamp=1403505246682, value=56
tomcat column=course:math, timestamp=1403505237535, value=67
3 row(s) in 0.0270 seconds
hbase(main):067:0> scan 'courses',{FILTER => org.apache.hadoop.hbase.filter.ColumnPaginationFilter.new(3, 0)}
ROW COLUMN+CELL
jarry column=course:, timestamp=1403490041032, value=89
jarry column=grade:, timestamp=1403489972431, value=4
tom column=course:art, timestamp=1403489905110, value=87
tom column=course:math, timestamp=1403491556319, value=100
tom column=grade:, timestamp=1403489817691, value=5
tomcat column=course:art, timestamp=1403505246682, value=56
tomcat column=course:math, timestamp=1403505237535, value=67
tomcat column=grade:, timestamp=1403505215820, value=12
3 row(s) in 0.0200 seconds
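The three ColumnPaginationFilter scans above differ only in the first constructor argument, the per-row column limit; the second argument is the column offset. A minimal Python sketch of that behavior follows. paginate_columns is a hypothetical helper, and the column lists are copied from the scan output.

```python
# Toy model of ColumnPaginationFilter(limit, offset): per row, skip `offset` columns,
# then return at most `limit` columns. Not HBase code.
from typing import Dict, List

def paginate_columns(rows: Dict[str, List[str]], limit: int, offset: int = 0) -> Dict[str, List[str]]:
    """Apply the same column window to every row independently."""
    return {row: columns[offset:offset + limit] for row, columns in rows.items()}

rows = {
    "jarry": ["course:", "grade:"],
    "tom": ["course:art", "course:math", "grade:"],
    "tomcat": ["course:art", "course:math", "grade:"],
}

first_column = paginate_columns(rows, limit=1, offset=0)
second_column = paginate_columns(rows, limit=1, offset=1)
print(first_column)
print(second_column)
```

The window applies to columns inside each row, not to rows, which is why every scan above still reports 3 row(s).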
A filter can be specified in two ways:
a. Using a filterString – more information on this is available in the
Filter Language document attached to the HBASE-4176 JIRA
b. Using the entire package name of the filter.
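There is also a CACHE_BLOCKS modifier that switches block caching for the scan on or off. It is on by default (CACHE_BLOCKS=>true) and can be turned off (CACHE_BLOCKS=>false).
9. Delete data
Insert a few rows first: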
hbase(main):017:0> put 'courses','rolin','course:math','34'
0 row(s) in 0.0170 seconds
hbase(main):018:0> put 'courses','rolin','course:math','78'
0 row(s) in 0.0440 seconds
Now delete:
hbase(main):019:0> get 'courses','rolin'
COLUMN CELL
course:math timestamp=1403493162261, value=78
grade: timestamp=1403493141114, value=4
2 row(s) in 0.0220 seconds
hbase(main):020:0> delete 'courses','rolin','grade'
0 row(s) in 0.0490 seconds
hbase(main):021:0> get 'courses','rolin'
COLUMN CELL
course:math timestamp=1403493162261, value=78
1 row(s) in 0.0190 seconds
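The delete command does not have many variations either; there is only one form:
hbase> delete 't1', 'r1', 'c1', ts1
There is also a deleteall command, which deletes an entire row. Use it with care!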
hbase(main):031:0> deleteall 'courses','rolin'
0 row(s) in 0.0200 seconds
hbase(main):032:0> scan 'courses'
ROW COLUMN+CELL
jarry column=course:, timestamp=1403490041032, value=89
jarry column=grade:, timestamp=1403489972431, value=4
tom column=course:art, timestamp=1403489905110, value=87
tom column=course:math, timestamp=1403491556319, value=100
tom column=grade:, timestamp=1403489817691, value=5
2 row(s) in 0.0280 seconds
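To delete everything in a table, use the truncate command. There is no direct whole-table delete; truncate is itself a combination of the disable, drop, and create commands.
10. Alter the table schema
disable 'courses'
alter 'courses',NAME=>'info'
enable 'courses'
The alter command is used as follows (on versions where it fails on an enabled table, disable the table first):
a. Change or add a column family:
hbase> alter 't1', NAME => 'f1', VERSIONS => 5
Note that the session below types NAME='info' (an assignment) instead of NAME=>'info'. The shell then most likely treats 'info' as a bare column family name and VERSIONS as a table attribute, which would explain why it prints "Unknown argument ignored: VERSIONS" and why the describe output still shows VERSIONS => '1' for info. The intended command is alter 'courses',NAME=>'info',VERSIONS=>5.
hbase(main):011:0> alter 'courses',NAME='info',VERSIONS=>5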
Updating all regions with the new schema...
0/1 regions updated.
1/1 regions updated.
Done.
Unknown argument ignored: VERSIONS
Updating all regions with the new schema...
0/1 regions updated.
1/1 regions updated.
Done.
0 row(s) in 4.5150 seconds
hbase(main):012:0> describe
describe describe_namespace
hbase(main):012:0> describe 'courses'
DESCRIPTION ENABLED
'courses', {NAME => 'course', DATA_BLOCK_ENCODING = true
> 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE =
> '0', COMPRESSION => 'NONE', VERSIONS => '4', TTL
=> '2147483647', MIN_VERSIONS => '0', KEEP_DELETED_
CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY =
> 'false', BLOCKCACHE => 'true'}, {NAME => 'grade',
DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW
', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE',
VERSIONS => '1', TTL => '2147483647', MIN_VERSIONS
=> '0', KEEP_DELETED_CELLS => 'false', BLOCKSIZE =
> '65536', IN_MEMORY => 'false', BLOCKCACHE => 'tru
e'}, {NAME => 'info', DATA_BLOCK_ENCODING => 'NONE'
, BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', V
ERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS
=> '0', TTL => '500', KEEP_DELETED_CELLS => 'false
', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOC
KCACHE => 'true'}, {NAME => 'name', DATA_BLOCK_ENCO
DING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_S
COPE => '0', COMPRESSION => 'NONE', VERSIONS => '1'
, TTL => '2147483647', MIN_VERSIONS => '0', KEEP_DE
LETED_CELLS => 'false', BLOCKSIZE => '65536', IN_ME
MORY => 'false', BLOCKCACHE => 'true'}
1 row(s) in 0.0960 seconds
b. Delete a column family:
hbase(main):022:0> alter 'courses',NAME=>'info',METHOD=>'delete'
c. Table attributes such as MAX_FILESIZE can also be changed:
hbase(main):022:0> alter 'courses',METHOD=>'table_att',MAX_FILESIZE=>'134217729'
Updating all regions with the new schema...
0/1 regions updated.
1/1 regions updated.
Done.
0 row(s) in 2.1960 seconds
d. A table coprocessor can be added:
hbase> alter 't1', METHOD => 'table_att', 'coprocessor'=> 'hdfs:///foo.jar|com.foo.FooRegionObserver|1001|arg1=1,arg2=2'
Several coprocessors can be configured on one table; a sequence number is incremented automatically to identify each one. The coprocessor attribute must follow this format:
[coprocessor jar file location] | class name | [priority] | [arguments]
e. Remove a table attribute or a coprocessor:
hbase> alter 't1', METHOD => 'table_att_unset', NAME => 'MAX_FILESIZE'
hbase> alter 't1', METHOD => 'table_att_unset', NAME => 'coprocessor$1'
f. Several alter operations can be run in one command:
hbase> alter 't1', {NAME => 'f1'}, {NAME => 'f2', METHOD => 'delete'}
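To make the coprocessor attribute format from item d concrete, the following Python sketch splits such a string into its four fields. It only illustrates the "jar | class | priority | arguments" layout; CoprocessorSpec and parse_coprocessor_spec are hypothetical names, and HBase does its own parsing internally.

```python
# Sketch: split a coprocessor attribute string into its four pipe-separated fields.
from typing import Dict, NamedTuple, Optional

class CoprocessorSpec(NamedTuple):
    jar_path: Optional[str]    # optional: location of the coprocessor jar
    class_name: str            # required: fully qualified class name
    priority: Optional[int]    # optional: integer priority
    arguments: Dict[str, str]  # optional: key=value pairs separated by commas

def parse_coprocessor_spec(spec: str) -> CoprocessorSpec:
    parts = [p.strip() for p in spec.split("|")]
    parts += [""] * (4 - len(parts))  # pad missing trailing fields with empty strings
    jar, class_name, priority, args = parts[:4]
    if not class_name:
        raise ValueError("class name is required")
    arguments = dict(pair.split("=", 1) for pair in args.split(",") if pair)
    return CoprocessorSpec(jar or None, class_name,
                           int(priority) if priority else None, arguments)

spec = parse_coprocessor_spec("hdfs:///foo.jar|com.foo.FooRegionObserver|1001|arg1=1,arg2=2")
print(spec)
```

Only the class name is mandatory in this model, mirroring the square brackets around the other three fields in the format line.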
11. Count rows:
hbase(main):029:0> count 'courses'
2 row(s) in 0.0490 seconds
=> 2
hbase(main):030:0> count 'courses',INTERVAL=>100
2 row(s) in 0.0310 seconds
=> 2
hbase(main):031:0> count 'courses',INTERVAL=>1000
2 row(s) in 0.0190 seconds
=> 2
hbase(main):032:0> count 'courses',INTERVAL=>10000
2 row(s) in 0.0280 seconds
=> 2
hbase(main):033:0> count 'courses',CACHE=>1000
2 row(s) in 0.0270 seconds
=> 2
hbase(main):034:0> count 'courses',CACHE=>1000,INTERVAL=>1000
2 row(s) in 0.0210 seconds
=> 2
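count can take a long time on a large table; for big tables the shell help suggests running the rowcount MapReduce job instead. Scan caching is enabled for count, with a default cache size of 10 rows (CACHE). The running count is printed every 1000 rows by default (INTERVAL).
12. disable and enable
Many operations require taking the table offline first, such as the alter operation above; dropping a table requires it too. disable_all and enable_all operate on several tables at once (they take a regular expression of table names).
13. Dropping a table
Disable the table first, then run the drop command.
drop 't1'
14. HBase shell scripts. Since these are shell commands, they can all be written into one file and executed in order, just like a Linux shell script. Put the HBase shell commands in a file and run: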
$ hbase shell test.hbaseshell
[hadoop@master ~]$ vim hbase.hsh
create 'test','info'
"hbase.hsh" [New] 1L, 21C written
[hadoop@master ~]$
[hadoop@master ~]$
[hadoop@master ~]$ pwd
/home/hadoop
[hadoop@master ~]$ ls
hbase.hsh Templates
[hadoop@master ~]$ hbase shell hbase.hsh
2014-06-23 13:41:55,966 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/cloud/hbase-0.96.1.1-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/cloud/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
0 row(s) in 1.9490 seconds
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.96.1.1-hadoop2, rUnknown, Tue Dec 17 12:22:12 PST 2013
hbase(main):001:0> list
TABLE
courses
test
2 row(s) in 0.0610 seconds
=> ["courses", "test"]
hbase(main):002:0> exit