Kafka Deployment, Development & Common Modules (Very Useful)

This article walks through Kafka installation step by step, covering both single-node and cluster setups with complete configuration examples. It also takes a deeper look at how Kafka works, including how messages are produced, stored, and consumed.

Environment setup series for 《SpringCloud Nginx 高并发核心编程》 (SpringCloud & Nginx High-Concurrency Core Programming)

Component links:

  • [Required] Linux VM development environment setup: expanding the VMware disk on Windows + shared folders
  • Linux auto-start: restart-on-hang and scheduled start
  • [Required] Linux Redis installation (with video)
  • [Required] Linux ZooKeeper installation (with video)
  • Windows Redis installation (with video)
  • RabbitMQ offline installation (with video)
  • ElasticSearch installation (with video)
  • Nacos installation (with video)
  • [Required] Eureka introduction (with video)
  • [Required] Spring Cloud Config introduction (with video)
  • [Required] SpringCloud scaffold packaging and startup

1 Apache Kafka Overview

Kafka was originally developed at LinkedIn. It is a distributed, partitioned, replicated, multi-subscriber log system that relies on ZooKeeper for coordination (it can also serve as an MQ system). Typical uses include web/nginx logs, access logs, and messaging services. LinkedIn contributed it to the Apache Software Foundation in 2011, and it became a top-level open-source project in 2012.


2 Apache Kafka Installation

2.1 - Verify the Java Installation

Java should already be installed on your machine, so simply verify it with the following command.

$ java -version

If Java is installed successfully on your machine, this prints the installed Java version.
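For example, on a machine running JDK 1.8 the output looks roughly like the following (the version and build numbers are illustrative and will differ on your machine):

java version "1.8.0_181"
Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)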

2.2 - Verify the ZooKeeper Installation

  • Apache Kafka depends on ZooKeeper at runtime, so before installing Kafka, check whether ZooKeeper is already installed.

  • The command to verify the ZooKeeper installation is:
/work/zookeeper/zookeeper-1/bin/zkServer.sh status

The result looks like this:


[root@localhost work]# /work/zookeeper/zookeeper-1/bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /work/zookeeper/zookeeper-1/bin/../conf/zoo.cfg
Mode: follower
[root@localhost work]# /work/zookeeper/zookeeper-2/bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /work/zookeeper/zookeeper-2/bin/../conf/zoo.cfg
Mode: leader
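
If the status command instead reports that no ZooKeeper instance is running, start each node first; a minimal sketch, assuming the same two-node layout under /work/zookeeper shown above:

/work/zookeeper/zookeeper-1/bin/zkServer.sh start
/work/zookeeper/zookeeper-2/bin/zkServer.sh start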

2.3 - Download Kafka


The download address is:

http://kafka.apache.org/downloads , and a copy is also ready on the 疯狂创客圈 (Crazy Maker Circle) net disk

A version earlier than 1.1 is recommended, for example kafka_2.11-1.0.2, which runs into fewer problems during installation. Then upload the Kafka package to the virtual machine.
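
Alternatively, the package can be fetched directly on the VM from the Apache archive (the URL below assumes the 1.0.2 release path; adjust it for other versions):

$ cd /work/
$ wget https://archive.apache.org/dist/kafka/1.0.2/kafka_2.11-1.0.2.tgz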


3 Single-Node Installation

Step 3.1 - Extract the tar File

Now that the Kafka package is on your machine, extract (decompress) the tar file with the following commands:

$ cd /work/
$ tar -zxvf kafka_2.11-1.0.2.tgz
$ cd kafka_2.11-1.0.2
[root@localhost kafka_2.11-1.0.2]# ll
total 52
drwxr-xr-x 3 root root  4096 Apr  7  2020 bin
drwxr-xr-x 2 root root  4096 Apr  7  2020 config
drwxr-xr-x 2 root root  4096 Nov 23 22:23 libs
-rw-r--r-- 1 root root 32216 Apr  7  2020 LICENSE
-rw-r--r-- 1 root root   337 Apr  7  2020 NOTICE
drwxr-xr-x 2 root root  4096 Apr  7  2020 site-docs

Step 3.2 - Create the Log Directory and Environment Variables

[root@localhost ~]# cd /work/kafka_2.11-1.0.2/
[root@localhost kafka_2.11-1.0.2]# mkdir -p logs/kafka1-logs

Create the environment variable with vi /etc/profile and append:

export KAFKA_HOME=/work/kafka_2.11-1.0.2
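
It is common to also append a PATH entry so the Kafka scripts can be called from anywhere (this extra line is an assumption, not part of the original profile snippet):

export PATH=$PATH:$KAFKA_HOME/bin

Then reload the profile so the variables take effect in the current shell:

$ source /etc/profile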

Modify the configuration file:

In Kafka's config directory there is a server.properties file; the main settings to change are as follows:

# Globally unique broker id; must not be duplicated
broker.id=1
# Listener
listeners=PLAINTEXT://192.168.233.128:9092

advertised.listeners=PLAINTEXT://192.168.233.128:9092

# Log directory
log.dirs=/work/kafka_2.11-1.0.2/logs/kafka1-logs
# ZooKeeper connection (if ZooKeeper is not on this machine, change this to its IP or hostname)
zookeeper.connect=localhost:2181

vi /work/kafka_2.11-1.0.2/config/server.properties

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# see kafka.server.KafkaConfig for additional details and defaults

############################# Server Basics #############################

# The id of the broker. This must be set to a unique integer for each broker.
broker.id=1

############################# Socket Server Settings #############################

# The address the socket server listens on. It will get the value returned from
# java.net.InetAddress.getCanonicalHostName() if not configured.
#   FORMAT:
#     listeners = listener_name://host_name:port
#   EXAMPLE:
#     listeners = PLAINTEXT://your.host.name:9092
listeners=PLAINTEXT://192.168.233.128:9092


# Hostname and port the broker will advertise to producers and consumers. If not set,
# it uses the value for "listeners" if configured.  Otherwise, it will use the value
# returned from java.net.InetAddress.getCanonicalHostName().
advertised.listeners=PLAINTEXT://192.168.233.128:9092

# Maps listener names to security protocols, the default is for them to be the same. See the config documentation for more details
#listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL

# The number of threads that the server uses for receiving requests from the network and sending responses to the network
num.network.threads=3

# The number of threads that the server uses for processing requests, which may include disk I/O
num.io.threads=8

# The send buffer (SO_SNDBUF) used by the socket server
socket.send.buffer.bytes=102400

# The receive buffer (SO_RCVBUF) used by the socket server
socket.receive.buffer.bytes=102400

# The maximum size of a request that the socket server will accept (protection against OOM)
socket.request.max.bytes=104857600


############################# Log Basics #############################

# A comma separated list of directories under which to store log files
log.dirs=/work/kafka_2.11-1.0.2/logs/kafka1-logs

# The default number of log partitions per topic. More partitions allow greater
# parallelism for consumption, but this will also result in more files across
# the brokers.
num.partitions=1

# The number of threads per data directory to be used for log recovery at startup and flushing at shutdown.
# This value is recommended to be increased for installations with data dirs located in RAID array.
num.recovery.threads.per.data.dir=1

############################# Internal Topic Settings  #############################
# The replication factor for the group metadata internal topics "__consumer_offsets" and "__transaction_state"
# For anything other than development testing, a value greater than 1 (such as 3) is recommended to ensure availability.
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1

############################# Log Flush Policy #############################

# Messages are immediately written to the filesystem but by default we only fsync() to sync
# the OS cache lazily. The following configurations control the flush of data to disk.
# There are a few important trade-offs here:
#    1. Durability: Unflushed data may be lost if you are not using replication.
#    2. Latency: Very large flush intervals may lead to latency spikes when the flush does occur as there will be a lot of data to flush.
#    3. Throughput: The flush is generally the most expensive operation, and a small flush interval may lead to excessive seeks.
# The settings below allow one to configure the flush policy to flush data after a period of time or
# every N messages (or both). This can be done globally and overridden on a per-topic basis.

# The number of messages to accept before forcing a flush of data to disk
#log.flush.interval.messages=10000

# The maximum amount of time a message can sit in a log before we force a flush
#log.flush.interval.ms=1000

############################# Log Retention Policy #############################

# The following configurations control the disposal of log segments. The policy can
# be set to delete segments after a period of time, or after a given size has accumulated.
# A segment will be deleted whenever *either* of these criteria are met. Deletion always happens
# from the end of the log.

# The minimum age of a log file to be eligible for deletion due to age
log.retention.hours=168

# A size-based retention policy for logs. Segments are pruned from the log as long as the remaining
# segments don't drop below log.retention.bytes. Functions independently of log.retention.hours.
#log.retention.bytes=1073741824

# The maximum size of a log segment file. When this size is reached a new log segment will be created.
log.segment.bytes=1073741824

# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies
log.retention.check.interval.ms=300000

############################# Zookeeper #############################

# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
zookeeper.connect=localhost:2181

# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=6000

############################# Group Coordinator Settings #############################

# The following configuration specifies the time, in milliseconds, that the GroupCoordinator will delay the initial consumer rebalance.
# The rebalance will be further delayed by the value of group.initial.rebalance.delay.ms as new members join the group, up to a maximum of max.poll.interval.ms.
# The default value for this is 3 seconds.
# We override this to 0 here as it makes for a better out-of-the-box experience for development and testing.
# However, in production environments the default value of 3 seconds is more suitable as this will help to avoid unnecessary, and potentially expensive, rebalances during application startup.
group.initial.rebalance.delay.ms=0
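
With the file saved, a quick grep confirms the edited settings, and the broker can then be started in the background and smoke-tested with a topic. A minimal sketch, assuming the paths and ZooKeeper address configured above (the topic name "test" is just an example):

$ cd /work/kafka_2.11-1.0.2
$ grep -E '^(broker.id|listeners|advertised.listeners|log.dirs|zookeeper.connect)' config/server.properties
$ bin/kafka-server-start.sh -daemon config/server.properties
$ jps | grep -i kafka    # the Kafka JVM process should appear
$ bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
$ bin/kafka-topics.sh --list --zookeeper localhost:2181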