Setting Up a Fully Distributed HBase Environment

This post walks through installing HBase step by step: setting up a standalone Zookeeper, downloading and unpacking HBase, configuring environment variables and hbase-site.xml, and the startup sequence, so that developers can bring up an HBase environment smoothly.


1. Before installing HBase you need Zookeeper. HBase actually ships with Zookeeper and can manage its own instance, but we usually deploy a standalone Zookeeper instead, for the following reason: a managed Zookeeper couples the two systems, so upgrading HBase drags Zookeeper along with it. With a standalone deployment, an HBase upgrade only has to care about HBase itself; besides, Zookeeper does not serve HBase alone. For installing and configuring a standalone Zookeeper, see my other post: http://blog.youkuaiyun.com/jthink_/article/details/38639979
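Before moving on, it is worth confirming the standalone Zookeeper ensemble is actually healthy. A minimal sketch; the bg01..bg04 hostnames and port 2181 are this cluster's layout and are assumptions, adjust them to yours:

```shell
# Interpret the reply to ZooKeeper's "ruok" four-letter command:
# a healthy server answers "imok".
zk_status() {
  if [ "$1" = "imok" ]; then echo "ok"; else echo "down"; fi
}

# Ask every node; hostnames and port are taken from this post's cluster.
for host in bg01 bg02 bg03 bg04; do
  reply=$(echo ruok | nc -w 2 "$host" 2181 2>/dev/null)
  echo "$host: $(zk_status "$reply")"
done
```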

2. Download hbase-0.94.8.tar.gz and extract it into /usr/local/bg, then configure conf/hbase-env.sh with the following content:

#
#/**
# * Copyright 2007 The Apache Software Foundation
# *
# * Licensed to the Apache Software Foundation (ASF) under one
# * or more contributor license agreements.  See the NOTICE file
# * distributed with this work for additional information
# * regarding copyright ownership.  The ASF licenses this file
# * to you under the Apache License, Version 2.0 (the
# * "License"); you may not use this file except in compliance
# * with the License.  You may obtain a copy of the License at
# *
# *     http://www.apache.org/licenses/LICENSE-2.0
# *
# * Unless required by applicable law or agreed to in writing, software
# * distributed under the License is distributed on an "AS IS" BASIS,
# * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# * See the License for the specific language governing permissions and
# * limitations under the License.
# */

# Set environment variables here.

# This script sets variables multiple times over the course of starting an hbase process,
# so try to keep things idempotent unless you want to take an even deeper look
# into the startup scripts (bin/hbase, etc.)

# The java implementation to use.  Java 1.6 required.
export JAVA_HOME=/usr/local/bg/jdk1.7.0_60

# Extra Java CLASSPATH elements.  Optional.
# export HBASE_CLASSPATH=

# The maximum amount of heap to use, in MB. Default is 1000.
# export HBASE_HEAPSIZE=1000

# Extra Java runtime options.
# Below are what we set by default.  May only work with SUN JVM.
# For more on why as well as other possible settings,
# see http://wiki.apache.org/hadoop/PerformanceTuning
export HBASE_OPTS="-XX:+UseConcMarkSweepGC"

# Uncomment one of the below three options to enable java garbage collection logging for the server-side processes.

# This enables basic gc logging to the .out file.
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"

# This enables basic gc logging to its own file.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"

# This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"

# Uncomment one of the below three options to enable java garbage collection logging for the client processes.

# This enables basic gc logging to the .out file.
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"

# This enables basic gc logging to its own file.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"

# This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"

# Uncomment below if you intend to use the EXPERIMENTAL off heap cache.
# export HBASE_OPTS="$HBASE_OPTS -XX:MaxDirectMemorySize="
# Set hbase.offheapcache.percentage in hbase-site.xml to a nonzero value.

# Uncomment and adjust to enable JMX exporting
# See jmxremote.password and jmxremote.access in $JRE_HOME/lib/management to configure remote password access.
# More details at: http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html
#
# export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"
# export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10101"
# export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10102"
# export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10103"
# export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10104"

# File naming hosts on which HRegionServers will run.  $HBASE_HOME/conf/regionservers by default.
# export HBASE_REGIONSERVERS=${HBASE_HOME}/conf/regionservers

# File naming hosts on which backup HMaster will run.  $HBASE_HOME/conf/backup-masters by default.
# export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters

# Extra ssh options.  Empty by default.
# export HBASE_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HBASE_CONF_DIR"

# Where log files are stored.  $HBASE_HOME/logs by default.
# export HBASE_LOG_DIR=${HBASE_HOME}/logs

# Enable remote JDWP debugging of major HBase processes. Meant for Core Developers
# export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8070"
# export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8071"
# export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8072"
# export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8073"

# A string representing this instance of hbase. $USER by default.
# export HBASE_IDENT_STRING=$USER

# The scheduling priority for daemon processes.  See 'man nice'.
# export HBASE_NICENESS=10

# The directory where pid files are stored. /tmp by default.
# export HBASE_PID_DIR=/var/hadoop/pids

# Seconds to sleep between slave commands.  Unset by default.  This
# can be useful in large clusters, where, e.g., slave rsyncs can
# otherwise arrive faster than the master can service them.
# export HBASE_SLAVE_SLEEP=0.1

# Tell HBase whether it should manage it's own instance of Zookeeper or not.
export HBASE_MANAGES_ZK=false
Note that setting HBASE_MANAGES_ZK to true means using the Zookeeper bundled with HBase. We set it to false here because we have already installed and configured our own Zookeeper.
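After editing (and later copying) the file, you can read the switch back out as a quick sanity check. A small helper, assuming only the file format shown above; the path in the comment is this post's install location:

```shell
# Print the value of HBASE_MANAGES_ZK from a given hbase-env.sh.
manages_zk() {
  grep '^export HBASE_MANAGES_ZK=' "$1" | cut -d= -f2
}

# e.g. on each node (path from this post's layout):
#   manages_zk /usr/local/bg/hbase-0.94.8/conf/hbase-env.sh   # should print: false
```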

Configure conf/hbase-site.xml with the following content:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
/**
 * Copyright 2010 The Apache Software Foundation
 *
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
-->
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://bg01:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>bg01,bg02,bg03,bg04</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/usr/local/bg/zookeeper-3.4.5/data</value>
  </property>
</configuration>
Here, hbase.zookeeper.quorum must list every Zookeeper node, including the node where the HMaster role runs (in our case, bg01). If you use the Zookeeper bundled with HBase, that is not needed; in that case the file would look like this:

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://master:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>bg02,bg03,bg04</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
As you can see, the essential difference is exactly the hbase.zookeeper.quorum entry discussed above.

Configure conf/regionservers with the following content:

bg02
bg03
bg04
The hosts listed here are the nodes that will run HRegionServer.

3. scp the configured hbase directory to the other node machines, so that every machine has an identical copy of hbase.
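This copy can be scripted. A sketch that only prints the scp commands so you can dry-run them first (pipe the output to sh to execute); the hostnames and install path are taken from this post's setup:

```shell
HBASE_DIR=/usr/local/bg/hbase-0.94.8   # install path used in this post

# Build the scp command that syncs the configured HBase tree to one host.
sync_cmd() {
  echo "scp -r $HBASE_DIR $1:/usr/local/bg/"
}

# Dry-run: print the command for every worker node.
for host in bg02 bg03 bg04; do
  sync_cmd "$host"
done
```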

4. Once configuration is done we can start everything. Because we use a standalone Zookeeper, make sure Zookeeper is running on every node before starting HBase: ssh to each node and run zkServer.sh start, and only then launch HBase with start-hbase.sh.

Use jps to inspect the processes on each node: bg01 should show HMaster, and the other machines should show the HRegionServer process.

And of course, Hadoop must already be running before any of this.
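The jps check above can be scripted as well. A sketch under this post's layout (bg01 is the master, the rest are regionservers); the live check over ssh is left commented out because it needs the running cluster:

```shell
# Which daemon each node should report in its jps output.
expected_daemon() {
  if [ "$1" = "bg01" ]; then echo "HMaster"; else echo "HRegionServer"; fi
}

for host in bg01 bg02 bg03 bg04; do
  echo "$host should show: $(expected_daemon "$host")"
  # Live version (requires the running cluster):
  # ssh "$host" jps | grep -q "$(expected_daemon "$host")" || echo "$host: MISSING"
done
```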

5. A few more words about starting HBase.

The startup order is hadoop -> zookeeper -> hbase,

so the shutdown order should be the reverse: hbase -> zookeeper -> hadoop.
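The ordering above can be sketched as a pair of wrapper scripts. These only echo the commands (a dry run); the hostnames are this post's, and start-dfs.sh/stop-dfs.sh are assumed to be your Hadoop start/stop scripts. Drop the echoes to actually execute:

```shell
# Start order: hadoop -> zookeeper -> hbase
start_all() {
  echo "start-dfs.sh"                      # 1. Hadoop (HDFS)
  for host in bg01 bg02 bg03 bg04; do
    echo "ssh $host zkServer.sh start"     # 2. ZooKeeper on every node
  done
  echo "start-hbase.sh"                    # 3. HBase
}

# Stop order is the reverse: hbase -> zookeeper -> hadoop
stop_all() {
  echo "stop-hbase.sh"
  for host in bg01 bg02 bg03 bg04; do
    echo "ssh $host zkServer.sh stop"
  done
  echo "stop-dfs.sh"
}

start_all
stop_all
```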
