Note: Kafka 2.0.x and 2.1.x are compatible with ZooKeeper 3.4.5.
Note: the cluster here consists of three servers. My master node is hadoop101 and the other two nodes are hadoop102 and hadoop103 (using hostnames directly requires hostname-to-IP mappings; if you configured them earlier you can leave them alone, otherwise use IP addresses instead). The pseudo-distributed (single-node) setup uses hadoop100. Wherever a hostname appears below, simply replace mine with your own hostname (or IP); the scripts later on are the typical place this matters.
The versions below are the ones used here (other versions install the same way, but with your own versions pay attention to ZooKeeper/Kafka/Scala compatibility; Scala 2.12.18 is used here).
The single-node (pseudo-distributed) installation of ZooKeeper and Kafka comes first.
Upload apache-zookeeper-3.8.4-bin.tar.gz and kafka_2.12-3.9.0.tgz (this Kafka requires ZooKeeper 3.5.x or later) to the directory where you keep your software packages.
Extract ZooKeeper into your software installation directory:
[zq@hadoop100 software]$ tar -zxvf apache-zookeeper-3.8.4-bin.tar.gz -C /opt/module/
Rename the directory (optional, but recommended):
[zq@hadoop100 software]$ cd /opt/module
[zq@hadoop100 module]$ mv apache-zookeeper-3.8.4-bin/ zookeeper-3.8.4
Extract Kafka into your software installation directory:
[zq@hadoop100 software]$ tar -zxvf kafka_2.12-3.9.0.tgz -C /opt/module/
Configure the environment variables:
[zq@hadoop100 module]$ cd ~
[zq@hadoop100 ~]$ cd /etc/profile.d
[zq@hadoop100 profile.d]$ sudo vim my_env.sh
Add the following:
# zookeeper
export ZK_HOME=/opt/module/zookeeper-3.8.4
export PATH=$PATH:$ZK_HOME/bin
# kafka
export KAFKA_HOME=/opt/module/kafka_2.12-3.9.0
export PATH=$PATH:$KAFKA_HOME/bin
Make the environment variables take effect (if it does not take effect, a reboot also works):
[zq@hadoop100 profile.d]$ source my_env.sh
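A quick optional sanity check that the variables are visible in the current shell; the paths shown are my installation paths, so expect your own if they differ:
[zq@hadoop100 profile.d]$ echo $ZK_HOME
/opt/module/zookeeper-3.8.4
[zq@hadoop100 profile.d]$ echo $KAFKA_HOME
/opt/module/kafka_2.12-3.9.0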
Create a zkdata directory under the ZooKeeper installation path to hold its data (snapshots and the myid file), plus a logs directory:
[zq@hadoop100 ~]$ cd /opt/module/zookeeper-3.8.4/
[zq@hadoop100 zookeeper-3.8.4]$ mkdir zkdata
[zq@hadoop100 zookeeper-3.8.4]$ mkdir logs
Assign the server ID:
[zq@hadoop100 zookeeper-3.8.4]$ cd zkdata/
[zq@hadoop100 zkdata]$ touch myid
[zq@hadoop100 zkdata]$ echo 1 > /opt/module/zookeeper-3.8.4/zkdata/myid
(note the space after the 1; "echo 1>myid" would redirect file descriptor 1 and leave the file empty)
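Optional check that the ID was written; the file should contain exactly the number 1:
[zq@hadoop100 zkdata]$ cat myid
1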
Configure the zoo.cfg file (it does not exist yet; create it or copy it from the sample file):
[zq@hadoop100 zookeeper-3.8.4]$ cd /opt/module/zookeeper-3.8.4/conf/
[zq@hadoop100 conf]$ cp zoo_sample.cfg zoo.cfg
[zq@hadoop100 conf]$ vim zoo.cfg
Modify or add the following:
dataDir=/opt/module/zookeeper-3.8.4/zkdata
dataLogDir=/opt/module/zookeeper-3.8.4/logs
server.1=hadoop100:2888:3888
Note: hadoop100 is my hostname; replace it with your own. dataDir already exists in the sample file and only needs its value changed; the dataLogDir=/opt/module/zookeeper-3.8.4/logs and server.1 lines are not there and have to be added by you.
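For reference, a minimal zoo.cfg for this single-node setup ends up roughly as below; the first four values come from zoo_sample.cfg unchanged, and only the last three lines are the ones modified or added above:
tickTime=2000
initLimit=10
syncLimit=5
clientPort=2181
dataDir=/opt/module/zookeeper-3.8.4/zkdata
dataLogDir=/opt/module/zookeeper-3.8.4/logs
server.1=hadoop100:2888:3888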
Start ZooKeeper:
[zq@hadoop100 conf]$ zkServer.sh start
Check the process:
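The ZooKeeper server shows up in jps as a Java process named QuorumPeerMain, so jps is the simplest check; zkServer.sh status should report Mode: standalone on a single node:
[zq@hadoop100 conf]$ jps
[zq@hadoop100 conf]$ zkServer.sh status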
At this point the single-node (pseudo-distributed) ZooKeeper setup is complete.
Next, configure the single-node (pseudo-distributed) Kafka.
First, go to the Kafka installation directory and create the directory that will hold Kafka's log segments:
[zq@hadoop100 kafka_2.12-3.9.0]$ mkdir -p kafka-logs
Configure server.properties (the file is in the config directory under the Kafka installation path):
[zq@hadoop100 libs]$ cd /opt/module/kafka_2.12-3.9.0/config
[zq@hadoop100 config]$ vim server.properties
Configure the following:
listeners=PLAINTEXT://hadoop100:9092
log.dirs=/opt/module/kafka_2.12-3.9.0/kafka-logs
zookeeper.connect=hadoop100:2181
Note: hadoop100 is my hostname. For the listeners line, remove the leading # and fill in your own hostname, so it reads listeners=PLAINTEXT://hadoop100:9092; adding the line from scratch works just as well.
Start Kafka (ZooKeeper must be running first; it was started above):
[zq@hadoop100 libs]$ kafka-server-start.sh /opt/module/kafka_2.12-3.9.0/config/server.properties
Open a new window and check whether port 9092 is listening:
[zq@hadoop100 ~]$ netstat -anop|grep 9092
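Once the port is up you can optionally verify the broker end to end from this new window by creating and listing a test topic (demo-test is just an example name):
[zq@hadoop100 ~]$ kafka-topics.sh --bootstrap-server hadoop100:9092 --create --topic demo-test --partitions 1 --replication-factor 1
[zq@hadoop100 ~]$ kafka-topics.sh --bootstrap-server hadoop100:9092 --list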
If Kafka reports errors during startup, try the steps below; if it started normally, skip them:
Go to the lib folder under Hadoop's share directory and replace the old ZooKeeper jar there with the zookeeper-3.8.4.jar that ships in Kafka's libs directory.
[zq@hadoop100 module]$ cd hadoop-3.1.3/
[zq@hadoop100 hadoop-3.1.3]$ cd share/hadoop/common/
[zq@hadoop100 common]$ cd lib/
[zq@hadoop100 lib]$ ll
Move the old ZooKeeper-related jar out of the way (anywhere is fine; if you are unsure where to put it, the location I used below works too):
[zq@hadoop100 lib]$ mv zookeeper-3.4.13.jar /opt/module/hadoop-3.1.3/
Go to the directory where Kafka keeps its jars and copy its bundled ZooKeeper jar into the Hadoop directory above:
[zq@hadoop100 ~]$ cd /opt/module/kafka_2.12-3.9.0/libs/
[zq@hadoop100 libs]$ ll
[zq@hadoop100 libs]$ cp zookeeper-3.8.4.jar /opt/module/hadoop-3.1.3/share/hadoop/common/lib/
Before shutting the machine down, stop all processes and services: stop ZooKeeper with zkServer.sh stop, and stop the Kafka service with Ctrl+C in the window where it is running.
At this point the single-node installation of ZooKeeper and Kafka is complete.
Next comes the cluster installation.
Upload apache-zookeeper-3.8.4-bin.tar.gz and kafka_2.12-3.9.0.tgz to the directory where you keep your software packages. They only need to go to the master node (hadoop101 in my case); configure everything there first and then distribute it to the other nodes.
Extract ZooKeeper into your software installation directory:
[zq@hadoop101 software]$ tar -zxvf apache-zookeeper-3.8.4-bin.tar.gz -C /opt/module/
Rename the directory (optional):
[zq@hadoop101 software]$ cd /opt/module/
[zq@hadoop101 module]$ mv apache-zookeeper-3.8.4-bin zookeeper-3.8.4
Create the data and log directories:
[zq@hadoop101 ~]$ cd /opt/module/zookeeper-3.8.4/
[zq@hadoop101 zookeeper-3.8.4]$ mkdir zkdata
[zq@hadoop101 zookeeper-3.8.4]$ mkdir logs
Configure the zoo.cfg file (it does not exist yet; create it or copy it from the sample file):
[zq@hadoop101 zookeeper-3.8.4]$ cd conf/
[zq@hadoop101 conf]$ cp zoo_sample.cfg zoo.cfg
[zq@hadoop101 conf]$ vim zoo.cfg
Modify or add the following:
dataDir=/opt/module/zookeeper-3.8.4/zkdata
server.1=hadoop101:2888:3888
server.2=hadoop102:2888:3888
server.3=hadoop103:2888:3888
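For reference, the cluster zoo.cfg ends up roughly as below (sample defaults plus the changes above); if you also want the transaction logs in the logs directory created earlier, add a dataLogDir line as in the single-node setup:
tickTime=2000
initLimit=10
syncLimit=5
clientPort=2181
dataDir=/opt/module/zookeeper-3.8.4/zkdata
server.1=hadoop101:2888:3888
server.2=hadoop102:2888:3888
server.3=hadoop103:2888:3888
Port 2888 is used for follower-to-leader connections and 3888 for leader election; they only need to be reachable between the three nodes.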
Create the myid file under /opt/module/zookeeper-3.8.4/zkdata:
[zq@hadoop101 zookeeper-3.8.4]$ cd zkdata/
[zq@hadoop101 zkdata]$ touch myid
Assign the server ID:
[zq@hadoop101 zkdata]$ echo 1 > /opt/module/zookeeper-3.8.4/zkdata/myid
Copy the configured ZooKeeper to the other nodes:
[zq@hadoop101 zkdata]$ scp -r /opt/module/zookeeper-3.8.4/ hadoop102:/opt/module/
[zq@hadoop101 zkdata]$ scp -r /opt/module/zookeeper-3.8.4/ hadoop103:/opt/module/
On the other nodes, assign their server IDs (the value in myid). Passwordless ssh is optional here (it needs to be configured; see the previous post); logging in to each node directly and making the change there works just as well.
[zq@hadoop101 zkdata]$ ssh hadoop102
Last login: Wed Dec 4 02:36:20 2024 from 192.168.89.1
[zq@hadoop102 ~]$ cd /opt/module/zookeeper-3.8.4/zkdata/
[zq@hadoop102 zkdata]$ echo 2 > /opt/module/zookeeper-3.8.4/zkdata/myid
[zq@hadoop102 zkdata]$ exit
logout
Connection to hadoop102 closed.
[zq@hadoop101 zkdata]$ ssh hadoop103
Last login: Tue Dec 3 18:07:24 2024 from 192.168.89.1
[zq@hadoop103 ~]$ cd /opt/module/zookeeper-3.8.4/zkdata/
[zq@hadoop103 zkdata]$ echo 3 > /opt/module/zookeeper-3.8.4/zkdata/myid
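An optional sanity check from hadoop101, assuming passwordless ssh is set up: each node should print its own number (1, 2, 3):
[zq@hadoop101 ~]$ for host in hadoop101 hadoop102 hadoop103; do ssh $host cat /opt/module/zookeeper-3.8.4/zkdata/myid; done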
Start ZooKeeper
The start command has to be run on every node: go to the ZooKeeper installation directory and run bin/zkServer.sh start
[zq@hadoop101 zookeeper-3.8.4]$ bin/zkServer.sh start
[zq@hadoop101 zookeeper-3.8.4]$ ssh hadoop102
Last login: Tue Dec 3 20:24:23 2024 from hadoop101
[zq@hadoop102 ~]$ cd /opt/module/zookeeper-3.8.4/
[zq@hadoop102 zookeeper-3.8.4]$ bin/zkServer.sh start
[zq@hadoop102 zookeeper-3.8.4]$ exit
logout
Connection to hadoop102 closed.
[zq@hadoop101 zookeeper-3.8.4]$ ssh hadoop103
Last login: Tue Dec 3 20:26:00 2024 from hadoop101
[zq@hadoop103 ~]$ cd /opt/module/zookeeper-3.8.4/
[zq@hadoop103 zookeeper-3.8.4]$ bin/zkServer.sh start
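With all three started, zkServer.sh status should report Mode: leader on one node and Mode: follower on the other two. A quick way to check all three from hadoop101 (assuming passwordless ssh):
[zq@hadoop101 zookeeper-3.8.4]$ for host in hadoop101 hadoop102 hadoop103; do ssh $host /opt/module/zookeeper-3.8.4/bin/zkServer.sh status; done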
Write a ZooKeeper start/stop script
[zq@hadoop101 config]$ cd ~
[zq@hadoop101 ~]$ cd bin/
[zq@hadoop101 bin]$ vim zookeeper
[zq@hadoop101 bin]$ chmod +x zookeeper
The script is below (change the paths to match your own installation directory and replace the hostnames with your own nodes):
#!/bin/bash
case $1 in
"start"){
for i in hadoop101 hadoop102 hadoop103
do
echo ---------- zookeeper $i start ------------
ssh $i "/opt/module/zookeeper-3.8.4/bin/zkServer.sh start"
done
};;
"stop"){
for i in hadoop101 hadoop102 hadoop103
do
echo ---------- zookeeper $i stop ------------
ssh $i "/opt/module/zookeeper-3.8.4/bin/zkServer.sh stop"
done
};;
"status"){
for i in hadoop101 hadoop102 hadoop103
do
echo ---------- zookeeper $i status ------------
ssh $i "/opt/module/zookeeper-3.8.4/bin/zkServer.sh status"
done
};;
*)
echo "Input Args Error..."
echo "$0 [start|stop|status]..."
;;
esac
Note: start the ensemble with zookeeper start, stop it with zookeeper stop, and check it with zookeeper status.
At this point the ZooKeeper cluster installation is complete.
Next is the Kafka cluster installation.
kafka_2.12-3.9.0 can still use ZooKeeper to store cluster, broker, consumer and other metadata, so the external ZooKeeper set up above will serve as Kafka's metadata store (if this is unfamiliar, don't worry; it does not affect the configuration below).
Extract Kafka into your software installation directory:
[zq@hadoop101 software]$ tar -zxvf kafka_2.12-3.9.0.tgz -C /opt/module/
Create the directory that will hold Kafka's log segments:
[zq@hadoop101 software]$ cd /opt/module/kafka_2.12-3.9.0/
[zq@hadoop101 kafka_2.12-3.9.0]$ mkdir -p kafka-logs
Configure server.properties (the file is in the config directory under the Kafka installation path):
Configure the following:
broker.id=0
log.dirs=/opt/module/kafka_2.12-3.9.0/kafka-logs
zookeeper.connect=hadoop101:2181,hadoop102:2181,hadoop103:2181/kafka
group.initial.rebalance.delay.ms=0
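Note that, unlike the single-node setup, no listeners line is set here; the broker then advertises the machine's own hostname by default, which works as long as hadoop101/102/103 resolve on every machine involved. If you prefer to set it explicitly, each node gets its own hostname, e.g. on hadoop101:
listeners=PLAINTEXT://hadoop101:9092
(and hadoop102/hadoop103 accordingly; make that change after distributing, otherwise the copies carry the wrong hostname).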
Distribute the whole Kafka installation directory to the other nodes:
[zq@hadoop101 module]$ scp -r /opt/module/kafka_2.12-3.9.0/ hadoop102:/opt/module/
[zq@hadoop101 module]$ scp -r /opt/module/kafka_2.12-3.9.0/ hadoop103:/opt/module/
On the other two nodes, change broker.id in server.properties to 1 and 2 respectively (a non-interactive alternative is shown after the steps below):
[zq@hadoop101 module]$ ssh hadoop102
Last login: Tue Dec 3 20:30:32 2024 from hadoop101
[zq@hadoop102 ~]$ cd /opt/module/kafka_2.12-3.9.0/config/
[zq@hadoop102 config]$ vim server.properties
[zq@hadoop102 config]$ exit
logout
Connection to hadoop102 closed.
[zq@hadoop101 module]$ ssh hadoop103
Last login: Tue Dec 3 20:32:18 2024 from hadoop101
[zq@hadoop103 ~]$ cd /opt/module/kafka_2.12-3.9.0/config/
[zq@hadoop103 config]$ vim server.properties
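If you would rather not edit the files by hand, the same change can be made from hadoop101 with sed (assuming passwordless ssh; double-check the result afterwards):
[zq@hadoop101 module]$ ssh hadoop102 "sed -i 's/^broker.id=0/broker.id=1/' /opt/module/kafka_2.12-3.9.0/config/server.properties"
[zq@hadoop101 module]$ ssh hadoop103 "sed -i 's/^broker.id=0/broker.id=2/' /opt/module/kafka_2.12-3.9.0/config/server.properties"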
Resolve the jar conflict (if you ran into this when starting the single-node Kafka, fix it on every node of the cluster as well; if the single-node start was fine, skip this step):
On hadoop101:
[zq@hadoop101 zookeeper-3.8.4]$ cd /opt/module/hadoop-3.1.3/share/hadoop/common/lib/
[zq@hadoop101 lib]$ mv zookeeper-3.4.13.jar /opt/module/hadoop-3.1.3/
[zq@hadoop101 lib]$ cd /opt/module/kafka_2.12-3.9.0/libs/
[zq@hadoop101 libs]$ cp zookeeper-3.8.4.jar /opt/module/hadoop-3.1.3/share/hadoop/common/lib/
On hadoop102:
[zq@hadoop102 zookeeper-3.8.4]$ cd /opt/module/hadoop-3.1.3/share/hadoop/common/lib/
[zq@hadoop102 lib]$ mv zookeeper-3.4.13.jar /opt/module/hadoop-3.1.3/
[zq@hadoop102 lib]$ cd /opt/module/kafka_2.12-3.9.0/libs/
[zq@hadoop102 libs]$ cp zookeeper-3.8.4.jar /opt/module/hadoop-3.1.3/share/hadoop/common/lib/
On hadoop103:
[zq@hadoop103 zookeeper-3.8.4]$ cd /opt/module/hadoop-3.1.3/share/hadoop/common/lib/
[zq@hadoop103 lib]$ mv zookeeper-3.4.13.jar /opt/module/hadoop-3.1.3/
[zq@hadoop103 lib]$ cd /opt/module/kafka_2.12-3.9.0/libs/
[zq@hadoop103 libs]$ cp zookeeper-3.8.4.jar /opt/module/hadoop-3.1.3/share/hadoop/common/lib/
Write a Kafka start/stop script
[zq@hadoop101 config]$ cd ~
[zq@hadoop101 ~]$ cd bin/
[zq@hadoop101 bin]$ vim kafka
[zq@hadoop101 bin]$ chmod +x kafka
The script is below (change the paths to match your own installation directory and replace the hostnames with your own nodes):
#!/bin/bash
kafka_start() {
for i in hadoop101 hadoop102 hadoop103
do
echo " --------启动 $i Kafka-------"
ssh $i "/opt/module/kafka_2.12-3.9.0/bin/kafka-server-start.sh -daemon /opt/module/kafka_2.12-3.9.0/config/server.properties"
done
}
kafka_stop() {
for i in hadoop101 hadoop102 hadoop103
do
echo " --------关闭 $i Kafka-------"
ssh $i "/opt/module/kafka_2.12-3.9.0/bin/kafka-server-stop.sh "
done
}
case $1 in
"start")
kafka_start
;;
"stop")
kafka_stop
;;
"restart")
kafka_stop
kafka_start
;;
*)
echo "Input Args Error..."
echo "$0 [start|stop|restart]..."
;;
esac
Start the Kafka cluster (ZooKeeper must be started first; it is already running from earlier):
[zq@hadoop101 ~]$ kafka start
Note: stop with kafka stop, restart with kafka restart.
Use the cluster process-checking script jpsall from the previous post to confirm that the Kafka and ZooKeeper (QuorumPeerMain) processes are present on every node.
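You can also verify the cluster from the Kafka side by creating a replicated test topic and describing it; demo-test is just an example name, and all three broker ids (0, 1, 2) should show up in the replica lists:
[zq@hadoop101 ~]$ kafka-topics.sh --bootstrap-server hadoop101:9092,hadoop102:9092,hadoop103:9092 --create --topic demo-test --partitions 3 --replication-factor 3
[zq@hadoop101 ~]$ kafka-topics.sh --bootstrap-server hadoop101:9092 --describe --topic demo-test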
To shut things down, do the following (stop Kafka first and then ZooKeeper: Kafka depends on ZooKeeper being up, so stopping Kafka first avoids unnecessary trouble):
[zq@hadoop101 ~]$ kafka stop
[zq@hadoop101 ~]$ zookeeper stop
At this point the cluster installation of ZooKeeper and Kafka is complete.