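This post sets up a distributed environment using the latest Hadoop cluster together with HBase. The latest stable Hadoop release is 2.2.0 and the stable HBase release is 0.94.14. The steps are as follows:
1. Install Hadoop
This step is covered in my blog post on Hadoop cluster installation.
2. Install HBase
HBase can be installed in one of three modes: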
- Standalone Mode
- Pseudo-Distributed Mode
- Fully-Distributed Mode
By default, HBase manages its own ZooKeeper cluster, which runs embedded in HBase. In other words, HBase starts and stops ZooKeeper itself. If ZooKeeper is to be managed externally, set the following in the configuration:
export HBASE_MANAGES_ZK=false
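2.1 Preparation
2.1.1 Build HBase against the matching Hadoop version
Download the stable HBase release.
The stable version is 0.94.14; it will probably move to 0.96 before long.
Edit pom.xml so that the build uses the matching Hadoop version.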
+++ pom.xml (working copy)
@@ -1034,7 +1034,7 @@
 <slf4j.version>1.4.3</slf4j.version>
 <log4j.version>1.2.16</log4j.version>
 <mockito-all.version>1.8.5</mockito-all.version>
-<protobuf.version>2.4.0a</protobuf.version>
+<protobuf.version>2.5.0</protobuf.version>
 <stax-api.version>1.0.1</stax-api.version>
 <thrift.version>0.8.0</thrift.version>
 <zookeeper.version>3.4.5</zookeeper.version>
@@ -2241,7 +2241,7 @@
 </property>
 </activation>
 <properties>
-<hadoop.version>2.0.0-alpha</hadoop.version>
+<hadoop.version>2.2.0</hadoop.version>
 <slf4j.version>1.6.1</slf4j.version>
 </properties>
 <dependencies>
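The same two version changes can also be scripted. The following is a minimal sketch, assuming the old values in your pom.xml are exactly the ones shown in the diff; `set_pom_versions` is a hypothetical helper name, not part of HBase or Maven.

```shell
#!/usr/bin/env bash
# Apply the two version changes from the diff to a pom.xml file.
# set_pom_versions is a hypothetical helper; review the result before building.
set_pom_versions() {
  local pom="$1"
  local tmp
  tmp="$(mktemp)" || return 1
  # Replace only the exact old values shown in the diff, so that
  # hadoop.version entries in other profiles are left untouched.
  sed -e 's|<protobuf.version>2\.4\.0a</protobuf.version>|<protobuf.version>2.5.0</protobuf.version>|' \
      -e 's|<hadoop.version>2\.0\.0-alpha</hadoop.version>|<hadoop.version>2.2.0</hadoop.version>|' \
      "$pom" > "$tmp" || { rm -f "$tmp"; return 1; }
  # Write back through redirection so the original file permissions are kept.
  cat "$tmp" > "$pom"
  rm -f "$tmp"
}

# Example, run from the HBase source directory:
# set_pom_versions pom.xml
```

Matching the full old value rather than any `<hadoop.version>` element keeps the edit limited to the Hadoop 2.0 profile.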
The Hadoop version compatibility matrix is as follows:
Hadoop version | HBase-0.92.x | HBase-0.94.x | HBase-0.96.0 | HBase-0.98.0 |
---|---|---|---|---|
Hadoop-0.20.205 | S | X | X | X |
Hadoop-0.22.x | S | X | X | X |
Hadoop-1.0.0-1.0.2[a] | S | S | X | X |
Hadoop-1.0.3+ | S | S | S | X |
Hadoop-1.1.x | NT | S | S | X |
Hadoop-0.23.x | X | S | NT | X |
Hadoop-2.0.x-alpha | X | NT | X | X |
Hadoop-2.1.0-beta | X | NT | S | X |
Hadoop-2.2.0 | X | NT[b] | S | S |
Hadoop-2.x | X | NT | S | S |
Where
S = supported and tested,
X = not supported,
NT = it should run, but not tested enough.
After making these changes, run the Maven build:
mvn clean install assembly:single -Dhadoop.profile=2.0 -DskipTests
2.2 Configure the OS
Configure the hosts file. Assume a cluster of four machines: one acts as the Master and the other three as RegionServers.
192.168.177.168 machine-2
192.168.177.167 machine-1
192.168.177.158 machine-0
192.168.177.172 hadoop-master hbase-master
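Since every node needs the same entries, it can help to verify the hosts file on each machine. The sketch below prints any expected host name that has no entry; `missing_hosts` is a hypothetical helper, and the host names are the ones assumed in this post.

```shell
#!/usr/bin/env bash
# Print each expected host name that has no entry in the given hosts file.
# missing_hosts is a hypothetical helper, not a system command.
missing_hosts() {
  local hosts_file="$1"; shift
  local name
  for name in "$@"; do
    # Skip comment lines, then look for the name in any field after the IP address.
    if ! awk -v n="$name" '
          /^[[:space:]]*#/ { next }
          { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
          END { exit found ? 0 : 1 }' "$hosts_file"; then
      echo "$name"
    fi
  done
}

# Example:
# missing_hosts /etc/hosts machine-0 machine-1 machine-2 hadoop-master hbase-master
```

No output means every name is present.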
2.3 Configure HBase
Edit hbase-env.sh to set the Java environment and how ZooKeeper is managed:
vim hbase-env.sh
export JAVA_HOME=your_java_home
export HBASE_MANAGES_ZK=false
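To confirm which ZooKeeper mode a given hbase-env.sh selects, a small check like the one below can be used. This is only a sketch: `zk_mode` is a hypothetical helper, and the example path assumes the /opt/hbase install location used later in this post.

```shell
#!/usr/bin/env bash
# Report how ZooKeeper is managed according to an hbase-env.sh file.
# zk_mode is a hypothetical helper, not an HBase command.
zk_mode() {
  local env_file="$1"
  # Match an uncommented "export HBASE_MANAGES_ZK=false" line.
  if grep -Eq '^[[:space:]]*export[[:space:]]+HBASE_MANAGES_ZK=false' "$env_file"; then
    echo "external"   # ZooKeeper runs and is managed outside HBase
  else
    echo "embedded"   # HBase starts and stops ZooKeeper itself (the default)
  fi
}

# Example (path is an assumption):
# zk_mode /opt/hbase/conf/hbase-env.sh
```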
Edit hbase-site.xml:
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://hadoop-master:9000/hbase</value>
<description>The directory shared by region servers.</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
<description>The mode the cluster will be in. Possible values are
false: standalone and pseudo-distributed setups with managed ZooKeeper
true: fully-distributed with unmanaged ZooKeeper Quorum (see hbase-env.sh)
</description>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2222</value>
<description>Property from ZooKeeper's config zoo.cfg.
The port at which the clients will connect.
</description>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>machine-0,machine-1,machine-2</value>
<description>Comma separated list of servers in the ZooKeeper Quorum.
For example, "host1.mydomain.com,host2.mydomain.com".
By default this is set to localhost for local and
pseudo-distributed modes of operation. For a
fully-distributed setup, this should be set to a full
list of ZooKeeper quorum servers. If
HBASE_MANAGES_ZK is set in hbase-env.sh
this is the list of servers which we will start/stop
ZooKeeper on.
</description>
</property>
</configuration>
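ZooKeeper clients address the quorum as a comma-separated list of host:port pairs, built from the hbase.zookeeper.quorum and hbase.zookeeper.property.clientPort values above. The sketch below assembles that string, for example to pass to ZooKeeper's `zkCli.sh -server`; `zk_connect_string` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Build a ZooKeeper connect string ("host:port,host:port,...") from a
# comma-separated quorum list and a client port.
# zk_connect_string is a hypothetical helper, not part of HBase or ZooKeeper.
zk_connect_string() {
  local quorum="$1" port="$2"
  local out="" host
  local IFS=','          # split the quorum list on commas
  for host in $quorum; do
    # Append "host:port", inserting a comma between entries.
    out="${out:+$out,}${host}:${port}"
  done
  echo "$out"
}

# Example with the values from this post:
# zk_connect_string "machine-0,machine-1,machine-2" 2222
```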
2.4 Distribute the configured HBase installation to each machine
#machine-0
scp -r /opt/hbase machine-0:/opt/
#machine-1
scp -r /opt/hbase machine-1:/opt/
#machine-2
scp -r /opt/hbase machine-2:/opt/
On the master, list the RegionServers in the regionservers file:
machine-0
machine-1
machine-2
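With more than a few nodes, the per-host scp commands can be driven from the regionservers file instead of being typed out. The following is a sketch under that assumption; `distribute_hbase` and the `DRY_RUN` variable are hypothetical names, and the paths are the ones used in this post.

```shell
#!/usr/bin/env bash
# Copy an HBase directory to every host listed in a regionservers file.
# distribute_hbase and DRY_RUN are hypothetical names for this sketch.
# With DRY_RUN=1 the scp commands are printed instead of executed.
distribute_hbase() {
  local src_dir="$1" servers_file="$2" host dest_dir
  dest_dir="$(dirname "$src_dir")"
  while IFS= read -r host || [ -n "$host" ]; do
    # Ignore blank lines and comments.
    case "$host" in ''|'#'*) continue ;; esac
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "scp -r $src_dir $host:$dest_dir/"
    else
      # Redirect stdin so scp does not consume the rest of the host list.
      scp -r "$src_dir" "$host:$dest_dir/" < /dev/null
    fi
  done < "$servers_file"
}

# Example (preview first, then run for real):
# DRY_RUN=1 distribute_hbase /opt/hbase /opt/hbase/conf/regionservers
# distribute_hbase /opt/hbase /opt/hbase/conf/regionservers
```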
Start HBase on hadoop-master:
# Note: start the Hadoop cluster first
/opt/hbase/bin/start-hbase.sh
2.5 Test HBase
Check the processes running on the HBase master:
[app@hadoop-master ~]$ jps
3453 Jps
3166 HMaster
2779 ResourceManager
2022 Bootstrap
2466 NameNode
2618 SecondaryNameNode
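Output like the above shows that the service processes have started.
You can also check the state of the HBase cluster in a browser by opening its status page.
Of course, you can also use HBase Shell commands to create a few tables. (The SLF4J bindings are somewhat redundant, mainly because of the changes to HBase's pom.xml; this will be fixed later.)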
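The check on the jps output can be automated with a small filter. This is a sketch only: `jps_has` is a hypothetical helper that reads jps output on standard input and succeeds if the named process is listed. On the RegionServer machines the process to look for would be HRegionServer.

```shell
#!/usr/bin/env bash
# Succeed if the named Java process appears in jps output read from stdin.
# jps_has is a hypothetical helper, not part of the JDK or HBase.
jps_has() {
  local name="$1"
  # jps prints "<pid> <name>"; compare the second field exactly.
  awk -v n="$name" '$2 == n { found = 1 } END { exit found ? 0 : 1 }'
}

# Example:
# jps | jps_has HMaster && echo "HMaster is running"
```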
[app@hadoop-master ~]$ hbase shell
14/01/10 20:19:33 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.94.14, rUnknown, Wed Jan 8 04:02:25 EST 2014
hbase(main):001:0> list
TABLE
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hbase/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
car
weblogs
2 row(s) in 3.1790 seconds
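Beyond `list`, a short write and read round trip gives a more complete smoke test. The sketch below pipes standard HBase Shell commands (create, put, scan, disable, drop) into a non-interactive shell session. The function names, the table name `smoke_test`, the column family `cf`, and the /opt/hbase location are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Print HBase Shell commands that create a table, write one cell,
# read it back, and then remove the table again.
# smoke_commands and run_smoke_test are hypothetical helper names.
smoke_commands() {
  local table="${1:-smoke_test}"
  cat <<EOF
create '$table', 'cf'
put '$table', 'row1', 'cf:a', 'value1'
scan '$table'
disable '$table'
drop '$table'
exit
EOF
}

# Feed the commands to the HBase Shell on standard input.
# HBASE_HOME defaults to /opt/hbase, the install path assumed in this post.
run_smoke_test() {
  smoke_commands "$@" | "${HBASE_HOME:-/opt/hbase}/bin/hbase" shell
}

# Example:
# run_smoke_test            # uses the table name smoke_test
# run_smoke_test my_table   # uses a custom table name
```

The table is dropped at the end, so the test leaves existing tables such as car and weblogs untouched.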
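PS: I have completed this installation successfully myself. If anything is missing or wrong, please point it out.
Please credit the source when reposting. Thanks!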