Big Data Basics: HBase 2.3.7 Installation Tutorial
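1. Preparation
This tutorial sets up an HBase cluster in fully distributed mode and stores the HBase data in the Hadoop file system (HDFS).
The cluster needs the following in place: a ZooKeeper ensemble, a Hadoop cluster, and the HBase installation package. Section 2 covers each in turn.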
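2. Installation
2.1 Install ZooKeeper
For ZooKeeper installation, see the earlier article: Big Data Basics: ZooKeeper Installation Tutorial (大数据基础-zookeeper的安装教程).
2.2 Install Hadoop
For Hadoop installation, see the earlier article.
2.3 Install HBase
We install HBase under /opt/module/hbase.
(1) Upload the package
Upload the installation package hbase-2.3.7-bin.tar.gz to /opt/software.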
[xikuang@hadoop102 ~]$ cd /opt/software/
(2) Extract the package
In /opt/software, locate hbase-2.3.7-bin.tar.gz and extract it into /opt/module.
[xikuang@hadoop102 software]$ tar -zxvf hbase-2.3.7-bin.tar.gz -C /opt/module/
(3) Rename
Rename the extracted hbase-2.3.7 directory to hbase.
[xikuang@hadoop102 module]$ mv hbase-2.3.7/ hbase
(4) Create the additional directories logs and tmp
[xikuang@hadoop102 hbase]$ mkdir logs
[xikuang@hadoop102 hbase]$ mkdir tmp
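The four steps above (extract, rename, create logs and tmp) can be collected into one repeatable function. This is only a sketch: the function name install_hbase is made up for illustration, and it assumes the tarball is named like hbase-2.3.7-bin.tar.gz and unpacks to a directory named after the part before -bin.tar.gz.

```shell
# install_hbase TARBALL DEST_PARENT  (hypothetical helper, not part of HBase)
# Extracts TARBALL under DEST_PARENT, renames the versioned directory to
# "hbase", and creates the logs/ and tmp/ subdirectories.
install_hbase() {
  tarball=$1
  parent=$2
  # hbase-2.3.7-bin.tar.gz -> hbase-2.3.7
  name=${tarball##*/}
  versioned=${name%-bin.tar.gz}
  # Do nothing if a previous run already produced DEST_PARENT/hbase.
  if [ -d "$parent/hbase" ]; then
    echo "already installed: $parent/hbase"
    return 0
  fi
  tar -zxf "$tarball" -C "$parent" || return 1
  mv "$parent/$versioned" "$parent/hbase" || return 1
  mkdir -p "$parent/hbase/logs" "$parent/hbase/tmp" || return 1
  echo "installed: $parent/hbase"
}
```

With the paths used in this tutorial, the call would be install_hbase /opt/software/hbase-2.3.7-bin.tar.gz /opt/module.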
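3. Configuration
Configuring HBase only requires editing hbase-env.sh, hbase-site.xml, regionservers, and the environment variables.
HBase's own configuration files are in /opt/module/hbase/conf; the environment variables go in /etc/profile.
No.  File            Description
1    hbase-env.sh    HBase environment settings
2    hbase-site.xml  Core HBase configuration
3    regionservers   RegionServer node list
4    /etc/profile    System environment variables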
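3.1 Configure hbase-env.sh
Add the lines below to hbase-env.sh. The two essential ones are JAVA_HOME, the location of the JDK, and HBASE_MANAGES_ZK, which controls whether HBase runs its bundled ZooKeeper; it is set to false here because a separate ZooKeeper cluster is used. The remaining lines set the Hadoop configuration directory, the log directory, and the PID directory.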
[xikuang@hadoop102 opt]$ cd module/hbase/
[xikuang@hadoop102 hbase]$ vim conf/hbase-env.sh
Add the following:
# Java installation directory
export JAVA_HOME=/opt/module/jdk1.8.0_212
# Hadoop configuration directory
export HBASE_CLASSPATH=/opt/module/hadoop-3.1.3/etc/hadoop
export HBASE_LOG_DIR=/opt/module/hbase/logs
export HBASE_MANAGES_ZK=false
export HBASE_PID_DIR=/opt/module/hbase/pids
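Editing hbase-env.sh by hand is fine for one node; when the step has to be repeated, appending each line only if it is missing avoids duplicates. The sketch below assumes the five export lines shown above; append_once and configure_hbase_env are hypothetical helper names, not HBase tooling.

```shell
# append_once FILE LINE: add LINE to FILE unless an identical line exists.
append_once() {
  grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# configure_hbase_env FILE: append the settings from this section to FILE.
configure_hbase_env() {
  append_once "$1" 'export JAVA_HOME=/opt/module/jdk1.8.0_212'
  append_once "$1" 'export HBASE_CLASSPATH=/opt/module/hadoop-3.1.3/etc/hadoop'
  append_once "$1" 'export HBASE_LOG_DIR=/opt/module/hbase/logs'
  append_once "$1" 'export HBASE_MANAGES_ZK=false'
  append_once "$1" 'export HBASE_PID_DIR=/opt/module/hbase/pids'
}
```

For this tutorial's layout the call would be configure_hbase_env /opt/module/hbase/conf/hbase-env.sh; running it a second time leaves the file unchanged.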
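3.2 Configure hbase-site.xml
Add the following properties inside the <configuration> element of hbase-site.xml.
Make sure you edit HBase's copy in /opt/module/hbase/conf, not a file of the same name in another directory.
<!-- Run HBase in distributed mode -->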
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.unsafe.stream.capability.enforce</name>
<value>false</value>
</property>
<!-- Temporary storage path on the local file system -->
<property>
<name>hbase.tmp.dir</name>
<value>/opt/module/hbase/tmp</value>
</property>
<!-- Path where HBase stores its data on HDFS. hbase.rootdir must match the Hadoop configuration: its host and port must be the same as fs.defaultFS in /opt/module/hadoop-3.1.3/etc/hadoop/core-site.xml. -->
<property>
<name>hbase.rootdir</name>
<value>hdfs://hadoop102:8020/hbase</value>
</property>
<!-- ZooKeeper addresses; separate multiple entries with "," -->
<property>
<name>hbase.zookeeper.quorum</name>
<value>hadoop102:2181,hadoop103:2181,hadoop104:2181</value>
</property>
<!-- Maximum allowed clock skew between nodes, in milliseconds -->
<property>
<name>hbase.master.maxclockskew</name>
<value>60000</value>
</property>
<!-- Root znode for HBase in ZooKeeper (the HBase default is /hbase) -->
<property>
<name>zookeeper.znode.parent</name>
<value>/hbase/master</value>
</property>
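Because hbase.rootdir has to agree with fs.defaultFS, a quick comparison of the two files can catch a mismatched host or port before startup. The following is a rough sketch, not an official tool: get_prop and check_rootdir are hypothetical helpers, and the awk parsing assumes each <name> and <value> element sits on a single line, as in the fragment above.

```shell
# get_prop FILE NAME: print the <value> that follows <name>NAME</name>.
get_prop() {
  awk -v key="$2" '
    index($0, "<name>" key "</name>") { found = 1 }
    found && match($0, /<value>[^<]*<\/value>/) {
      # Strip the 7-character <value> and 8-character </value> tags.
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }' "$1"
}

# check_rootdir HBASE_SITE CORE_SITE: succeed if hbase.rootdir lies under
# the file system named by fs.defaultFS.
check_rootdir() {
  rootdir=$(get_prop "$1" hbase.rootdir)
  default_fs=$(get_prop "$2" fs.defaultFS)
  default_fs=${default_fs%/}
  if [ -z "$rootdir" ] || [ -z "$default_fs" ]; then
    echo "missing hbase.rootdir or fs.defaultFS"
    return 2
  fi
  case $rootdir in
    "$default_fs"/*) echo "OK: $rootdir is under $default_fs" ;;
    *) echo "MISMATCH: $rootdir is not under $default_fs"; return 1 ;;
  esac
}
```

With this tutorial's paths the call would be check_rootdir /opt/module/hbase/conf/hbase-site.xml /opt/module/hadoop-3.1.3/etc/hadoop/core-site.xml.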
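3.3 Configure regionservers
RegionServers should normally run on the same hosts as the DataNodes so that data is stored locally, which improves performance.
List one hostname per line, and only one per line.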
[xikuang@hadoop102 hbase]$ vim conf/regionservers
hadoop102
hadoop103
hadoop104
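The one-hostname-per-line rule is easy to break with a stray space or a repeated entry. A small check along these lines may help; check_regionservers is a hypothetical helper, and it only validates the file's shape, not whether the hosts are reachable.

```shell
# check_regionservers FILE: report lines that hold more than one entry or
# repeat a hostname; the exit status is non-zero if any problem is found.
check_regionservers() {
  awk '
    NF == 0 { next }    # ignore blank lines
    NF > 1  { printf "line %d: more than one entry: %s\n", NR, $0; bad = 1 }
    seen[$1]++ { printf "line %d: duplicate host: %s\n", NR, $1; bad = 1 }
    END { exit bad + 0 }
  ' "$1"
}
```

For this tutorial the call would be check_regionservers /opt/module/hbase/conf/regionservers, which prints nothing when the file is well formed.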
3.4 Configure environment variables
Add the following HBase environment variables to /etc/profile.
[xikuang@hadoop102 ~]$ sudo vim /etc/profile
# HBase
export HBASE_HOME=/opt/module/hbase
export PATH=${HBASE_HOME}/bin:${PATH}
Reload the profile so the changes take effect:
[xikuang@hadoop102 ~]$ source /etc/profile
3.5 Distribute HBase to the other nodes
xsync is this cluster's own file-distribution script, not an HBase command.
[xikuang@hadoop102 module]$ xsync hbase/
[xikuang@hadoop102 module]$ sudo xsync /etc/profile
Apply the environment variables on hadoop103 and hadoop104:
[xikuang@hadoop103 module]$ source /etc/profile
[xikuang@hadoop104 ~]$ source /etc/profile
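The xsync command used in this step is not a standard tool, and its definition is not shown in this article. As a rough equivalent, assuming it simply wraps rsync for each target host, the sketch below prints the rsync commands that would copy a directory to the same path on the other nodes. sync_cmds is a hypothetical name, and the function only prints; nothing is copied until the printed lines are actually run.

```shell
# sync_cmds DIR HOST...: print one rsync command per host that mirrors DIR
# to the same absolute path on that host (an assumption about what xsync does).
sync_cmds() {
  src=${1%/}
  shift
  for host in "$@"; do
    echo "rsync -av $src/ $host:$src/"
  done
}
```

For example, sync_cmds /opt/module/hbase hadoop103 hadoop104 prints two commands, one per target node.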
3.6 Verify the configuration
Run hbase version; it should print the HBase version:
[xikuang@hadoop102 ~]$ hbase version
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/module/hadoop-3.1.3/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/module/hbase/lib/client-facing-thirdparty/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase 2.3.7
Source code repository git://bc84a1a3c651/home/vagrant/hbase-rm/output/hbase revision=8b2f5141e900c851a2b351fccd54b13bcac5e2ed
Compiled by vagrant on Tue Oct 12 16:38:55 UTC 2021
From source with checksum c18a9f329233d7fbbe4938009977da0b1ce243a38c66dafaf1b7f8820e412969ee3e6bff6ce33657226e4d82eaaef31277e18097ed344ee76c54db6fc4020b37
4. Starting and stopping
(1) Start
Start order: Hadoop first, then ZooKeeper, and HBase last.
Start HBase with:
[xikuang@hadoop102 module]$ cd /opt/module/hbase/
[xikuang@hadoop102 hbase]$ bin/start-hbase.sh
(2) Stop
Stop order: HBase first, then ZooKeeper, and Hadoop last.
Stop HBase with:
[xikuang@hadoop102 /]$ cd /opt/module/hbase/
[xikuang@hadoop102 hbase]$ bin/stop-hbase.sh
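The start and stop orders are mirror images of each other, which a pair of wrapper functions makes explicit. This is a sketch under several assumptions: ZK_HOME=/opt/module/zookeeper is a made-up path (the ZooKeeper location is not given in this article), zkServer.sh acts only on the local node and so has to be run on every ZooKeeper host, and only the HDFS part of Hadoop is started. By default the functions just print the commands (DRY_RUN=1).

```shell
HADOOP_HOME=${HADOOP_HOME:-/opt/module/hadoop-3.1.3}
HBASE_HOME=${HBASE_HOME:-/opt/module/hbase}
ZK_HOME=${ZK_HOME:-/opt/module/zookeeper}   # hypothetical path

# run CMD...: print the command when DRY_RUN=1 (the default), else execute it.
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi
}

# Start order: Hadoop (HDFS) -> ZooKeeper -> HBase.
cluster_start() {
  run "$HADOOP_HOME/sbin/start-dfs.sh" &&
  run "$ZK_HOME/bin/zkServer.sh" start &&   # repeat on each ZooKeeper node
  run "$HBASE_HOME/bin/start-hbase.sh"
}

# Stop order is the reverse: HBase -> ZooKeeper -> Hadoop (HDFS).
cluster_stop() {
  run "$HBASE_HOME/bin/stop-hbase.sh" &&
  run "$ZK_HOME/bin/zkServer.sh" stop &&    # repeat on each ZooKeeper node
  run "$HADOOP_HOME/sbin/stop-dfs.sh"
}
```

Calling cluster_start with DRY_RUN unset prints the three commands in order; setting DRY_RUN=0 would execute them on the local machine.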
(3) Open the web UI
With HBase running, open http://hadoop102:16010/ in a browser.
(4) Verify HBase
Open the interactive HBase shell with the command:
hbase shell
5. HBase Shell operations
5.1 Basic operations
1. Enter the HBase shell
[xikuang@hadoop102 hbase]$ bin/hbase shell
2. Show the help
hbase(main):001:0> help
3. List the tables in the current database
hbase(main):002:0> list
5.2 Table operations
1. Create a table
hbase(main):003:0> create 'student','info'
Created table student
Took 1.2849 seconds
=> Hbase::Table - student
2. Insert data into the table
hbase(main):004:0> put 'student','1001','info:sex','male'
Took 0.2774 seconds
hbase(main):005:0> put 'student','1001','info:age','18'
Took 0.0165 seconds
hbase(main):006:0> put 'student','1002','info:name','janna'
Took 0.0138 seconds
hbase(main):007:0> put 'student','1002','info:sex','female'
Took 0.0147 seconds
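Each put writes a single cell, so loading more than a handful of values by hand gets tedious. One way to script it, sketched here with a hypothetical helper gen_puts, is to turn lines of the form rowkey,family:qualifier,value into put commands. The sketch assumes the values contain no commas or single quotes.

```shell
# gen_puts TABLE: read "rowkey,family:qualifier,value" lines on stdin and
# print one HBase shell put command per line. \047 is a single quote.
gen_puts() {
  awk -F, -v table="$1" 'NF == 3 {
    printf "put \047%s\047,\047%s\047,\047%s\047,\047%s\047\n", table, $1, $2, $3
  }'
}
```

The output can be reviewed first and then piped into the shell's non-interactive mode, for example gen_puts student < rows.csv | hbase shell -n, where rows.csv is a made-up file name.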
3. Scan the table
hbase(main):008:0> scan 'student'
ROW COLUMN+CELL
1001 column=info:age, timestamp=2021-12-25T20:18:12.157, value=18
1001 column=info:sex, timestamp=2021-12-25T20:17:45.258, value=male
1002 column=info:name, timestamp=2021-12-25T20:18:44.270, value=janna
1002 column=info:sex, timestamp=2021-12-25T20:19:00.735, value=female
2 row(s)
Took 0.1163 seconds
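The scan above prints four lines but reports 2 row(s): each line is one cell, and the row count is the number of distinct row keys. The sketch below makes that distinction concrete by summarizing saved scan output; summarize_scan is a hypothetical helper and assumes row keys contain no whitespace.

```shell
# summarize_scan: read HBase shell scan output on stdin and print the number
# of distinct row keys and the number of cells.
summarize_scan() {
  awk '
    $2 ~ /^column=/ {    # a cell line: ROWKEY column=FAMILY:QUALIFIER, ...
      cells++
      if (!($1 in rows)) { rows[$1] = 1; nrows++ }
    }
    END { printf "%d row(s), %d cell(s)\n", nrows, cells }
  '
}
```

Fed the four cell lines shown above, it reports two rows and four cells.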
4. View the table structure
hbase(main):009:0> describe 'student'
Table student is ENABLED
student
COLUMN FAMILIES DESCRIPTION
{NAME => 'info', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s)
Quota is disabled
Took 1.2089 seconds
5. Update the value of a specific column
hbase(main):010:0> put 'student','1001','info:name','Nick'
Took 0.2178 seconds
hbase(main):011:0> put 'student','1001','info:age','100'
Took 0.0151 seconds
6. View the data of a specific row or a specific "column family:column"
hbase(main):016:0> get 'student','1001','info:name'
COLUMN CELL
info:name timestamp=2021-12-29T22:22:50.037, value=Nick
1 row(s)
Took 0.0532 seconds
7. Count the rows in the table
hbase(main):017:0> count 'student'
2 row(s)
Took 0.0590 seconds
=> 2
8. Delete data
Delete all data for a row key:
hbase(main):019:0> deleteall 'student','1001'
Took 0.0142 seconds
hbase(main):020:0> scan 'student'
ROW COLUMN+CELL
1002 column=info:name, timestamp=2021-12-29T22:21:25.603, value=janna
1002 column=info:sex, timestamp=2021-12-29T22:21:31.929, value=female
1 row(s)
Took 0.0193 seconds
Delete one column of a row key:
hbase(main):021:0> delete 'student','1002','info:sex'
Took 0.0137 seconds
hbase(main):022:0> scan 'student'
ROW COLUMN+CELL
1002 column=info:name, timestamp=2021-12-29T22:21:25.603, value=janna
1 row(s)
Took 0.0092 seconds
9. Truncate the table
hbase(main):023:0> truncate 'student'
Truncating 'student' table (it may take a while):
Disabling table...
Truncating table...
Took 2.4709 seconds
hbase(main):024:0> scan 'student'
ROW COLUMN+CELL
0 row(s)
Took 0.1387 seconds
Note: truncate disables the table first and then truncates it, as the output above shows.
10. Drop the table
The table must first be put into the disabled state:
hbase(main):019:0> disable 'student'
It can then be dropped:
hbase(main):020:0> drop 'student'
Note: dropping a table that is still enabled fails with: ERROR: Table student is enabled. Disable it first.
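Since drop only works on a disabled table, the two commands always travel together. A minimal sketch of scripting that pair follows; gen_drop is a hypothetical helper that only prints the commands, so they can be checked before being sent to the HBase shell.

```shell
# gen_drop TABLE: print the disable and drop commands in the required order.
gen_drop() {
  printf "disable '%s'\ndrop '%s'\n" "$1" "$1"
}
```

To execute the result non-interactively, the output could be piped as gen_drop student | hbase shell -n.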