Using 3 VM nodes, install apache-zookeeper-3.8.1, Hadoop 3.3.5, apache-hive-3.1.3, and hbase-2.5.4 on them — list the steps in detail.

Below are complete steps for installing ZooKeeper 3.8.1, Hadoop 3.3.5, Hive 3.1.3, and HBase 2.5.4 on a 3-node virtual machine cluster, covering environment setup, cluster deployment, and component integration:


Environment Preparation

1. Virtual machine setup
  • OS: CentOS 7/8 or Ubuntu 20.04 LTS (CentOS 7 is used in this guide)

  • Node plan

    Hostname  IP address     Roles
    node1     192.168.1.101  ZooKeeper, NameNode, ResourceManager, HMaster, Hive Metastore
    node2     192.168.1.102  ZooKeeper, DataNode, NodeManager, RegionServer
    node3     192.168.1.103  ZooKeeper, DataNode, NodeManager, RegionServer
2. Base configuration (run on all nodes)

bash

# 1. Set the hostname and hosts entries
sudo hostnamectl set-hostname node1  # use node2/node3 on the other nodes
echo "192.168.1.101 node1
192.168.1.102 node2
192.168.1.103 node3" | sudo tee -a /etc/hosts

# 2. Disable the firewall and SELinux
sudo systemctl stop firewalld
sudo systemctl disable firewalld
sudo setenforce 0
sudo sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config

# 3. Install JDK 8+ (must be a Hadoop-compatible version)
sudo yum install -y java-1.8.0-openjdk-devel
echo 'export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk' | sudo tee -a /etc/profile
echo 'export PATH=$JAVA_HOME/bin:$PATH' | sudo tee -a /etc/profile
source /etc/profile

# 4. Set up passwordless SSH (run on node1)
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
ssh-copy-id node1
ssh-copy-id node2
ssh-copy-id node3
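The hosts block and key distribution above can be generated from a single node list, which keeps the three nodes consistent; a minimal sketch (node names and IPs are the examples from the plan above) that only prints the entries and commands for review:

```shell
#!/usr/bin/env bash
# Sketch: generate the /etc/hosts block and the ssh-copy-id commands
# from one node list. Prints only; review before applying anything.
NODES=(node1 node2 node3)
IPS=(192.168.1.101 192.168.1.102 192.168.1.103)

gen_hosts() {
  local i
  for i in "${!NODES[@]}"; do
    printf '%s %s\n' "${IPS[$i]}" "${NODES[$i]}"
  done
}

gen_ssh_cmds() {
  local n
  for n in "${NODES[@]}"; do
    printf 'ssh-copy-id %s\n' "$n"
  done
}

gen_hosts
gen_ssh_cmds
```

Once the output looks right, append the hosts block via `gen_hosts | sudo tee -a /etc/hosts`.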

I. Install the ZooKeeper 3.8.1 Cluster

1. Download and extract (all nodes)

bash

wget https://archive.apache.org/dist/zookeeper/zookeeper-3.8.1/apache-zookeeper-3.8.1-bin.tar.gz
sudo tar -zxvf apache-zookeeper-3.8.1-bin.tar.gz -C /opt/
sudo mv /opt/apache-zookeeper-3.8.1-bin /opt/zookeeper
sudo chown -R $USER:$USER /opt/zookeeper  # allow running ZooKeeper as a normal user

2. Configure ZooKeeper

bash

# 1. Create the data and log directories
sudo mkdir -p /opt/zookeeper/data /opt/zookeeper/logs
sudo chown -R $USER:$USER /opt/zookeeper/data /opt/zookeeper/logs

# 2. Write zoo.cfg (all nodes)
cat << EOF | sudo tee /opt/zookeeper/conf/zoo.cfg
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/zookeeper/data
dataLogDir=/opt/zookeeper/logs
clientPort=2181
server.1=node1:2888:3888
server.2=node2:2888:3888
server.3=node3:2888:3888
EOF

# 3. Create the myid file (run ONLY the line for the current node;
#    the value must match that node's server.N entry in zoo.cfg)
# on node1:
echo "1" | sudo tee /opt/zookeeper/data/myid
# on node2:
echo "2" | sudo tee /opt/zookeeper/data/myid
# on node3:
echo "3" | sudo tee /opt/zookeeper/data/myid

# 4. Add environment variables
echo 'export ZOOKEEPER_HOME=/opt/zookeeper' | sudo tee -a /etc/profile
echo 'export PATH=$ZOOKEEPER_HOME/bin:$PATH' | sudo tee -a /etc/profile
source /etc/profile
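Keeping the three myid echoes straight is error-prone; since zoo.cfg already maps server IDs to hostnames, the id can be derived from the config. A sketch (it works on a temporary copy of the server lines shown above, so it runs anywhere):

```shell
#!/usr/bin/env bash
# Sketch: derive a node's myid from the server.N lines in zoo.cfg,
# matching on hostname. Uses a temp copy of the config shown above.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
server.1=node1:2888:3888
server.2=node2:2888:3888
server.3=node3:2888:3888
EOF

myid_for() {  # $1 = hostname, $2 = path to zoo.cfg
  sed -n "s/^server\.\([0-9][0-9]*\)=$1:.*/\1/p" "$2"
}

myid_for node2 "$cfg"   # prints 2
# On a real node:
#   myid_for "$(hostname)" /opt/zookeeper/conf/zoo.cfg | sudo tee /opt/zookeeper/data/myid
```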

3. Start the ZooKeeper cluster

bash

# Start on all nodes
/opt/zookeeper/bin/zkServer.sh start
# Check status (one node should report "leader", the others "follower")
/opt/zookeeper/bin/zkServer.sh status

II. Install the Hadoop 3.3.5 Cluster

1. Download and extract (all nodes)

bash

wget https://archive.apache.org/dist/hadoop/common/hadoop-3.3.5/hadoop-3.3.5.tar.gz
sudo tar -zxvf hadoop-3.3.5.tar.gz -C /opt/
sudo mv /opt/hadoop-3.3.5 /opt/hadoop
sudo chown -R $USER:$USER /opt/hadoop  # allow running Hadoop as a normal user

2. Configure Hadoop

bash

# 1. Add environment variables
echo 'export HADOOP_HOME=/opt/hadoop' | sudo tee -a /etc/profile
echo 'export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH' | sudo tee -a /etc/profile
source /etc/profile

# JAVA_HOME is not inherited over ssh, so set it in hadoop-env.sh as well
echo 'export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk' | sudo tee -a /opt/hadoop/etc/hadoop/hadoop-env.sh

# 2. Edit the configuration files (all nodes)
# core-site.xml
cat << EOF | sudo tee /opt/hadoop/etc/hadoop/core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://node1:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop/tmp</value>
  </property>
</configuration>
EOF

# hdfs-site.xml
cat << EOF | sudo tee /opt/hadoop/etc/hadoop/hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/opt/hadoop/data/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/opt/hadoop/data/datanode</value>
  </property>
</configuration>
EOF

# mapred-site.xml
cat << EOF | sudo tee /opt/hadoop/etc/hadoop/mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF

# yarn-site.xml
cat << EOF | sudo tee /opt/hadoop/etc/hadoop/yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>node1</value>
  </property>
</configuration>
EOF

# workers — lists the DataNode/NodeManager hosts (all nodes; matches the node plan above)
echo "node2
node3" | sudo tee /opt/hadoop/etc/hadoop/workers

3. Format and start HDFS/YARN

bash

# 1. Create the data directories (all nodes)
sudo mkdir -p /opt/hadoop/data/{namenode,datanode} /opt/hadoop/tmp
sudo chown -R $USER:$USER /opt/hadoop/data /opt/hadoop/tmp

# 2. Format HDFS (node1 only, and only once)
hdfs namenode -format

# 3. Start HDFS (on node1)
start-dfs.sh

# 4. Start YARN (on node1)
start-yarn.sh

# 5. Verify
jps  # expect NameNode/ResourceManager on node1, DataNode/NodeManager on the workers
hdfs dfs -ls /  # list the HDFS root directory
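The jps check can be turned into a script that names exactly which daemons are missing; a sketch (the piped listing below is illustrative — on a real node, pipe `jps` itself into the function):

```shell
#!/usr/bin/env bash
# Sketch: report which expected daemons are absent from a jps listing.
# Real usage on node1: jps | check_daemons NameNode ResourceManager
check_daemons() {  # args = expected process names; jps output on stdin
  local listing d
  local -a missing=()
  listing=$(cat)
  for d in "$@"; do
    grep -qw "$d" <<<"$listing" || missing+=("$d")
  done
  if [ "${#missing[@]}" -eq 0 ]; then
    echo "all daemons running"
  else
    echo "missing: ${missing[*]}"
  fi
}

# Illustrative listing (node1 with YARN not yet started):
printf '12345 NameNode\n12400 Jps\n' |
  check_daemons NameNode ResourceManager   # prints: missing: ResourceManager
```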

III. Install the HBase 2.5.4 Cluster

1. Download and extract (all nodes)

bash

wget https://archive.apache.org/dist/hbase/2.5.4/hbase-2.5.4-bin.tar.gz
sudo tar -zxvf hbase-2.5.4-bin.tar.gz -C /opt/
sudo mv /opt/hbase-2.5.4 /opt/hbase
sudo chown -R $USER:$USER /opt/hbase  # allow running HBase as a normal user

2. Configure HBase

bash

# 1. Add environment variables
echo 'export HBASE_HOME=/opt/hbase' | sudo tee -a /etc/profile
echo 'export PATH=$HBASE_HOME/bin:$PATH' | sudo tee -a /etc/profile
source /etc/profile

# Use the external ZooKeeper cluster instead of HBase's bundled one
echo 'export HBASE_MANAGES_ZK=false' | sudo tee -a /opt/hbase/conf/hbase-env.sh
echo 'export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk' | sudo tee -a /opt/hbase/conf/hbase-env.sh

# 2. Edit the configuration files (all nodes)
# hbase-site.xml
cat << EOF | sudo tee /opt/hbase/conf/hbase-site.xml
<configuration>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://node1:9000/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>node1,node2,node3</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/opt/zookeeper/data</value>
  </property>
</configuration>
EOF

# regionservers — lists the RegionServer hosts (all nodes)
echo "node2
node3" | sudo tee /opt/hbase/conf/regionservers

# 3. Copy the Hadoop configs into HBase's conf directory
sudo cp /opt/hadoop/etc/hadoop/core-site.xml /opt/hbase/conf/
sudo cp /opt/hadoop/etc/hadoop/hdfs-site.xml /opt/hbase/conf/
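If the configs are edited on node1 first, the copies to the other nodes can be generated and reviewed before anything is transferred; a sketch that only prints the scp commands (paths and hostnames are the ones used in this guide):

```shell
#!/usr/bin/env bash
# Sketch: print (but do not run) the scp commands that would push the
# edited configs from node1 to the other nodes, for review first.
gen_sync_cmds() {
  local host file
  local -a files=(
    /opt/hadoop/etc/hadoop/core-site.xml
    /opt/hadoop/etc/hadoop/hdfs-site.xml
    /opt/hbase/conf/hbase-site.xml
  )
  for host in node2 node3; do
    for file in "${files[@]}"; do
      printf 'scp %s %s:%s\n' "$file" "$host" "$file"
    done
  done
}
gen_sync_cmds
```

Pipe the reviewed output through `bash` (or run the lines by hand) to perform the actual copies.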

3. Start the HBase cluster

bash

# Start HMaster from node1 (start-hbase.sh also starts the RegionServers over ssh)
start-hbase.sh

# Check processes
jps  # expect HMaster on node1 and HRegionServer on node2/node3

# Verify
hbase shell
> list  # show tables

IV. Install Hive 3.1.3

1. Download and extract (node1 only)

bash

wget https://archive.apache.org/dist/hive/hive-3.1.3/apache-hive-3.1.3-bin.tar.gz
sudo tar -zxvf apache-hive-3.1.3-bin.tar.gz -C /opt/
sudo mv /opt/apache-hive-3.1.3-bin /opt/hive
sudo chown -R $USER:$USER /opt/hive  # allow running Hive as a normal user

2. Configure Hive

bash

# 1. Add environment variables
echo 'export HIVE_HOME=/opt/hive' | sudo tee -a /etc/profile
echo 'export PATH=$HIVE_HOME/bin:$PATH' | sudo tee -a /etc/profile
source /etc/profile

# 2. Write hive-site.xml (the heredoc below replaces the file entirely, so
#    there is no need to copy hive-default.xml.template first)
cat << EOF | sudo tee /opt/hive/conf/hive-site.xml
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://node1:3306/hive_metastore?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://node1:9083</value>
  </property>
</configuration>
EOF

# 3. Install MariaDB (MySQL-compatible; run on node1)
sudo yum install -y mariadb-server mariadb
sudo systemctl start mariadb
sudo systemctl enable mariadb

# 4. Create the Hive metastore database
mysql -u root -p  # on a fresh MariaDB install root has no password; just press Enter
CREATE DATABASE hive_metastore;
CREATE USER 'hive'@'%' IDENTIFIED BY 'hive';
GRANT ALL PRIVILEGES ON hive_metastore.* TO 'hive'@'%';
FLUSH PRIVILEGES;
exit

# 5. Download the MySQL JDBC driver (5.1.x matches the com.mysql.jdbc.Driver class configured above)
wget https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.49.tar.gz
tar -zxvf mysql-connector-java-5.1.49.tar.gz
sudo cp mysql-connector-java-5.1.49/mysql-connector-java-5.1.49.jar /opt/hive/lib/

3. Initialize and start Hive

bash

# 1. Replace Hive's bundled guava with Hadoop's newer copy first; the version
#    mismatch otherwise makes schematool fail with a NoSuchMethodError
#    (check the exact jar names in both lib directories before copying)
rm /opt/hive/lib/guava-19.0.jar
cp /opt/hadoop/share/hadoop/common/lib/guava-27.0-jre.jar /opt/hive/lib/

# 2. Initialize the metastore schema
schematool -dbType mysql -initSchema

# 3. Start the Hive Metastore service
hive --service metastore &

# 4. Start the Hive CLI
hive
> CREATE TABLE test (id INT);
> SHOW TABLES;

V. Verify Component Integration

1. Read HBase data from Hive

bash

# Create a table in HBase
hbase shell
> create 'hbase_table', 'cf'
> put 'hbase_table', 'row1', 'cf:name', 'Alice'

# Map the HBase table in Hive
hive
CREATE EXTERNAL TABLE hive_hbase_table (key STRING, name STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,cf:name')
TBLPROPERTIES ('hbase.table.name' = 'hbase_table');

SELECT * FROM hive_hbase_table;  -- should return the HBase row
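A common mistake in the mapping above is a mismatch between the hbase.columns.mapping entries and the Hive column list; a sketch that checks the counts before the table is created (':key' counts as the row-key entry, names are the example above):

```shell
#!/usr/bin/env bash
# Sketch: verify that hbase.columns.mapping has exactly one entry per
# Hive column. ":key" maps the HBase row key; "cf:name" maps a column.
mapping_matches() {  # $1 = mapping string, $2 = number of Hive columns
  local n
  n=$(awk -F',' '{print NF}' <<<"$1")
  [ "$n" -eq "$2" ]
}

mapping_matches ':key,cf:name' 2 && echo "mapping OK" || echo "count mismatch"
```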

Troubleshooting

  1. Port conflicts: check whether ports such as 2181 (ZooKeeper), 9000 (HDFS), and 16010 (HBase Web UI) are already in use.

  2. Log analysis: inspect each component's logs (e.g. /opt/hadoop/logs/ and /opt/hbase/logs/).

  3. Firewall: make sure all required ports are open between the nodes.
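The port checks in item 1 need no extra tools: bash can probe a TCP port through its /dev/tcp pseudo-path. A sketch using the hosts and ports from this guide (an unreachable, closed, or unused port reports closed):

```shell
#!/usr/bin/env bash
# Sketch: probe the service ports used in this guide. /dev/tcp is a
# bash-internal path, so no netcat or nmap is required.
port_open() {  # $1 = host, $2 = port
  timeout 2 bash -c "echo > /dev/tcp/$1/$2" 2>/dev/null
}

for spec in node1:2181 node1:9000 node1:16010; do
  host=${spec%:*} port=${spec#*:}
  if port_open "$host" "$port"; then
    echo "$spec open"
  else
    echo "$spec closed"
  fi
done
```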

With the steps above, the full Hadoop ecosystem stack is deployed on the three-node cluster.
