Preface
Our non-relational database course instructor assigned homework that requires us to set up HBase and MongoDB. Since I learned that HBase performance can be improved by running it on a cluster, I decided to try that in my own desktop environment. I will record the problems I ran into along the way here.
Environment
- OS: Ubuntu 16.04
- JDK: 1.8
- Hadoop: 3.2.1

Server          IP address
hadoop-master   192.168.41.141
hadoop-node01   192.168.41.142
hadoop-node02   192.168.41.143
Modify the hosts file
sudo vi /etc/hosts
Add the following lines:
192.168.41.141 hadoop-master
192.168.41.142 hadoop-node01
192.168.41.143 hadoop-node02
The hosts file on all three servers needs to be modified. The change takes effect immediately; no reload command is required (/etc/hosts is not a shell script, so it cannot be sourced).
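To confirm that the name resolution works, a quick check on any of the nodes is to ping each alias once (a minimal sanity check):
ping -c 1 hadoop-master
ping -c 1 hadoop-node01
ping -c 1 hadoop-node02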
Configure SSH
First, we need to set up SSH keys so that the three servers can log in to each other without a password.
ssh-keygen -t rsa
The generated key pair is stored under ~/.ssh. The public key now needs to be added to authorized_keys:
cat ~/.ssh/id_rsa.pub > ~/.ssh/authorized_keys
In principle, we would then generate SSH keys on the other two servers as well and add each server's public key to the master's authorized_keys, so that all three servers can log in to each other password-free.
For convenience, I simply copied ~/.ssh to the home directory of the other two servers instead:
scp -r ~/.ssh root@hadoop-node01:~/.ssh
scp -r ~/.ssh root@hadoop-node02:~/.ssh
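Copying the whole ~/.ssh directory works here only because all three machines end up sharing one key pair. A more conventional alternative is to generate a key on each node and push its public key to every other node with ssh-copy-id (a sketch; ssh-copy-id ships with the stock Ubuntu OpenSSH client):
# Run on each node, once per target host; this appends the local public key
# to the remote ~/.ssh/authorized_keys.
ssh-copy-id root@hadoop-master
ssh-copy-id root@hadoop-node01
ssh-copy-id root@hadoop-node02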
The configuration can be verified with the following command:
root@hadoop-master:/# ssh hadoop-node01
Welcome to Ubuntu 16.04 LTS (GNU/Linux 4.4.0-21-generic x86_64)
* Documentation: https://help.ubuntu.com/
234 packages can be updated.
149 updates are security updates.
Last login: Mon Sep 30 11:09:53 2019 from ::1
Configure Hadoop
Download Hadoop
Run the following command. Note that we want the binary distribution, not the -src source package, since the steps below rely on the prebuilt scripts and jars:
wget https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/stable/hadoop-3.2.1.tar.gz
Unpack it:
tar -zxvf hadoop-3.2.1.tar.gz
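The rest of the article assumes Hadoop lives at /root/hadoop (see the scp and HADOOP_HOME settings below), so assuming the archive unpacks to a hadoop-3.2.1 directory, it helps to rename it right away:
mv hadoop-3.2.1 /root/hadoop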
Modify the configuration files (under etc/hadoop/)
hadoop-env.sh
vi hadoop-env.sh
# Set the path to the JDK installation
export JAVA_HOME=/usr/local/jdk/jdk1.8.0_221/
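One note for running everything as root, as this article does: Hadoop 3's start scripts refuse to launch daemons as root unless the run-as user for each daemon is declared. If start-all.sh later aborts complaining about undefined *_USER variables, adding the following to hadoop-env.sh is a common workaround (assuming the daemons are indeed meant to run as root):
# Declare which user each daemon runs as; required when starting the cluster as root.
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root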
core-site.xml
vi core-site.xml
Add the following configuration:
<configuration>
  <!-- Use HDFS as the file system; this is the NameNode address -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://hadoop-master:9000</value>
  </property>
  <!-- Directory for temporary files -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/root/hadoop/hdfs/tmp</value>
  </property>
</configuration>
mapred-site.xml
vi mapred-site.xml
Add the following configuration:
<configuration>
  <!-- The framework MapReduce runs on. The default is local, which simulates a job
       in a single local process instead of running it distributed on the cluster.
       Here we run on YARN, which takes care of resource (memory) allocation. -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
yarn-site.xml
vi yarn-site.xml
<configuration>
  <!-- Address of the YARN ResourceManager -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop-master</value>
  </property>
  <!-- How reducers fetch data (the shuffle service) -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <!-- Skip the virtual memory check; on a physical machine with plenty of memory
       this property can be removed -->
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
</configuration>
hdfs-site.xml
Add the following:
<configuration>
  <!-- Number of HDFS replicas -->
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <!-- NameNode web UI address; accessible at 192.168.41.141:50070 -->
  <property>
    <name>dfs.namenode.http-address</name>
    <value>hadoop-master:50070</value>
  </property>
  <!-- Where the DataNodes store uploaded data blocks -->
  <property>
    <name>dfs.data.dir</name>
    <value>/root/hadoop/hdfs/data</value>
  </property>
  <!-- Where the NameNode stores its metadata -->
  <property>
    <name>dfs.name.dir</name>
    <value>/root/hadoop/hdfs/name</value>
  </property>
  <!-- Set to false to disable HDFS permission checks -->
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
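The local paths used above (/root/hadoop/hdfs/tmp, /root/hadoop/hdfs/name and /root/hadoop/hdfs/data) should exist and be writable; Hadoop can usually create them on its own, but creating them up front on each node avoids permission surprises (a small precaution, assuming the same layout on all three servers):
mkdir -p /root/hadoop/hdfs/tmp /root/hadoop/hdfs/name /root/hadoop/hdfs/data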
workers
In Hadoop 2.9.1 this configuration file was called slaves, but in version 3.2 it was renamed to workers. Pay close attention to this during deployment: because I missed it, the DataNode and NodeManager processes never started on the worker nodes after I started Hadoop on the master.
vi workers
Add the following:
hadoop-node01
hadoop-node02
Deploy Hadoop
Copy the modified Hadoop folder to the node01 and node02 machines:
scp -r /root/hadoop root@hadoop-node01:/root/
scp -r /root/hadoop root@hadoop-node02:/root/
Then modify the environment variables on every node, adding the following:
vi /etc/profile
# Add the following lines
export HADOOP_HOME=~/hadoop/
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS=-Djava.library.path=$HADOOP_HOME/lib
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH:$HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
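After editing /etc/profile on each node, reload it and check that the Hadoop commands are on the PATH. Note also that HDFS has to be formatted once on the master before the first start; start-all.sh in the next step assumes this has already been done. A minimal sketch:
source /etc/profile
hadoop version            # should report 3.2.1
hdfs namenode -format     # first start only; initializes /root/hadoop/hdfs/name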
Start Hadoop
Run start-all.sh to start Hadoop:
root@hadoop-master:~/hadoop/etc/hadoop# start-all.sh
Starting namenodes on [hadoop-master]
Starting datanodes
Starting secondary namenodes [hadoop-master]
2019-09-30 19:10:07,637 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting resourcemanager
Starting nodemanagers
On the master node, the started processes can be checked with the jps command:
root@hadoop-master:~/hadoop/etc/hadoop# jps
20241 Jps
19857 SecondaryNameNode
126115 NodeManager
125540 DataNode
20088 ResourceManager
19626 NameNode
On the worker nodes:
root@hadoop-node01:~/hadoop/etc/hadoop# jps
122113 NodeManager
121974 DataNode
122942 Jps
At this point, Hadoop has been deployed successfully.
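The cluster membership can also be verified from the command line on the master; both commands should list the two worker nodes (a quick check, assuming the daemons above are running):
hdfs dfsadmin -report     # DataNodes registered with the NameNode
yarn node -list           # NodeManagers registered with the ResourceManager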
You can also visit the web UI at 192.168.41.141:50070:

Configure HBase
Download HBase
wget http://mirror.bit.edu.cn/apache/hbase/2.2.1/hbase-2.2.1-bin.tar.gz
Unpack it:
tar -zxvf hbase-2.2.1-bin.tar.gz
Rename it:
mv hbase-2.2.1 hbase
Configure environment variables
Add the HBase path to the environment variables:
vi /etc/profile
# Add the following lines
export HBASE_HOME=/root/hbase
export PATH=$HBASE_HOME/bin:$PATH
Run source /etc/profile to make it take effect.
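A quick way to confirm the variables took effect (a minimal check):
echo $HBASE_HOME
hbase version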
Modify the configuration files (in the conf folder)
hbase-env.sh
Modify as follows:
# The java implementation to use. Java 1.8+ required.
export JAVA_HOME=/usr/local/jdk/jdk1.8.0_221
# Extra Java CLASSPATH elements. Optional.
export JAVA_CLASSPATH=$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
# Where log files are stored. $HBASE_HOME/logs by default.
export HBASE_LOG_DIR=/root/hbase/logs
# Tell HBase whether it should manage it's own instance of ZooKeeper or not.
export HBASE_MANAGES_ZK=true
hbase-site.xml
Modify as follows:
<configuration>
  <property>
    <name>hbase.master</name>
    <value>hadoop-master:6000</value>
  </property>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://hadoop-master:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.tmp.dir</name>
    <value>/root/hbase/tmp</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/root/zookeeper</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>hadoop-master,hadoop-node01,hadoop-node02</value>
  </property>
</configuration>
regionservers
Modify as follows:
hadoop-node01
hadoop-node02
Deploy HBase
Copy the modified HBase folder to the node01 and node02 machines:
scp -r /root/hbase root@hadoop-node01:/root/
scp -r /root/hbase root@hadoop-node02:/root/
Also modify the environment variables on both nodes, as above.
Start HBase
Run start-hbase.sh to start HBase:
root@hadoop-master:~/zookeeper# start-hbase.sh
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/root/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class
SLF4J: Found binding in [jar:file:/root/hbase/lib/client-facing-thirdparty/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/root/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class
SLF4J: Found binding in [jar:file:/root/hbase/lib/client-facing-thirdparty/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
hadoop-master: running zookeeper, logging to /root/hbase/logs/hbase-root-zookeeper-hadoop-master.out
hadoop-node02: running zookeeper, logging to /root/hbase/logs/hbase-root-zookeeper-hadoop-node02.out
hadoop-node01: running zookeeper, logging to /root/hbase/logs/hbase-root-zookeeper-hadoop-node01.out
running master, logging to /root/hbase/logs/hbase-root-master-hadoop-master.out
hadoop-node01: running regionserver, logging to /root/hbase/logs/hbase-root-regionserver-hadoop-node01.out
hadoop-node02: running regionserver, logging to /root/hbase/logs/hbase-root-regionserver-hadoop-node02.out
Run hbase shell:

At this point, the clustered HBase setup is complete.
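As a final check, a small smoke test inside the HBase shell confirms that the master and region servers are actually serving requests. This is only a sketch; the table name and column family below are made up for illustration:
status                                      # number of live servers and regions
create 'smoke_test', 'cf'                   # hypothetical table with one column family
put 'smoke_test', 'row1', 'cf:greeting', 'hello'
scan 'smoke_test'
disable 'smoke_test'
drop 'smoke_test'                           # clean up the test table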