Kafka is a distributed message-processing middleware.
Before we start
- Host environment: VMware® Workstation 12 Pro 12.5.6 build-5528349
- Linux version: CentOS-7-x86_64-Minimal-1611.iso
- JDK version: jdk-8u65-linux-x64.tar.gz
- Hadoop version: hadoop-2.7.6.tar.gz
- ZooKeeper version: zookeeper-3.4.12.tar.gz
- Kafka version: kafka_2.11-2.0.0.tgz
Kafka installation and configuration
- Extract to the /soft directory and create a symbolic link
# extract to /soft
$> tar -xzvf /mnt/hgfs/bigdata/soft/kafka_2.11-2.0.0.tgz -C /soft
# create a symbolic link
$> ln -s /soft/kafka_2.11-2.0.0 /soft/kafka
- Configure the environment variables in /etc/profile, then run `source /etc/profile` so they take effect immediately
# kafka
export KAFKA_HOME=/soft/kafka
export PATH=$PATH:$KAFKA_HOME/bin
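The edit above can be scripted so it is safe to re-run. A minimal sketch, which writes to a temp file for illustration (on the real hosts the target would be /etc/profile):

```shell
# Append the Kafka exports only if they are not already present.
# Using a temp file here for illustration; on a real host set profile=/etc/profile.
profile=$(mktemp)
grep -q 'KAFKA_HOME' "$profile" 2>/dev/null || cat >> "$profile" <<'EOF'
# kafka
export KAFKA_HOME=/soft/kafka
export PATH=$PATH:$KAFKA_HOME/bin
EOF
cat "$profile"
```

The `grep -q` guard keeps the file from accumulating duplicate export lines when the script runs more than once.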
- Configure the Kafka cluster on hosts s202, s203, and s204. Edit /soft/kafka/config/server.properties as shown below, giving each broker its own broker.id, then distribute it to the other hosts (to copy the symbolic link itself: rsync -l kafka centosmin0@s204:/soft)
# The id of the broker. This must be set to a unique integer for each broker.
broker.id=202
listeners=PLAINTEXT://:9092
# A comma separated list of directories under which to store log files
log.dirs=/home/centosmin0/kafka/logs
# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
zookeeper.connect=s201:2181,s202:2181,s203:2181
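Since broker.id must be unique per host, one convention that fits the sNNN hostnames used in this cluster is to derive the id from the hostname. A sketch (the helper name is an illustration, not part of Kafka):

```shell
# Derive a unique broker.id from an sNNN hostname, e.g. s203 -> 203.
broker_id_for() {
    echo "${1#s}"   # strip the leading "s"
}

broker_id_for s202   # prints 202
broker_id_for s204   # prints 204
```

On each host the result could then be written into server.properties, so the same file can be distributed everywhere and patched in one step.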
Starting the Kafka servers
- Start ZooKeeper first: zkServer.sh start
- Start the Kafka brokers on s202, s203, s204
$>bin/kafka-server-start.sh config/server.properties
- Verify that the Kafka brokers are up
$>netstat -anop | grep 9092
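Starting each broker by hand on three machines gets tedious; the steps above can be driven from one host over ssh using the `-daemon` flag of kafka-server-start.sh. A sketch, shown as a dry run (remove `echo` to execute; assumes passwordless ssh between the hosts):

```shell
# Print the start command for each broker host (dry run; drop "echo" to run).
for host in s202 s203 s204; do
    echo ssh "$host" "/soft/kafka/bin/kafka-server-start.sh -daemon /soft/kafka/config/server.properties"
done
```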
Basic usage of the Kafka command-line tools
- Create a topic named test
$>bin/kafka-topics.sh --create --zookeeper s201:2181 --replication-factor 3 --partitions 3 --topic test
- List the topics
$>bin/kafka-topics.sh --list --zookeeper s201:2181
- Start a console producer
$>bin/kafka-console-producer.sh --broker-list s202:9092 --topic test
- Start a console consumer
$>bin/kafka-console-consumer.sh --bootstrap-server s202:9092 --topic test --from-beginning
Integrating Flume with Kafka
- KafkaSink: Flume acts as the producer, Kafka as the consumer
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.type=netcat
a1.sources.r1.bind=localhost
a1.sources.r1.port=8888
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.topic = test
a1.sinks.k1.kafka.bootstrap.servers = s202:9092
a1.sinks.k1.kafka.flumeBatchSize = 20
a1.sinks.k1.kafka.producer.acks = 1
a1.channels.c1.type=memory
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
- KafkaSource: Kafka acts as the producer, Flume as the consumer
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
a1.sources.r1.batchSize = 5000
a1.sources.r1.batchDurationMillis = 2000
a1.sources.r1.kafka.bootstrap.servers = s202:9092
a1.sources.r1.kafka.topics = test
a1.sources.r1.kafka.consumer.group.id = g4
a1.sinks.k1.type = logger
a1.channels.c1.type=memory
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
- KafkaChannel: Flume acts as both producer and consumer, with Kafka backing the channel
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.type = avro
a1.sources.r1.bind = localhost
a1.sources.r1.port = 8888
a1.sinks.k1.type = logger
a1.channels.c1.type = org.apache.flume.channel.kafka.KafkaChannel
a1.channels.c1.kafka.bootstrap.servers = s202:9092
a1.channels.c1.kafka.topic = test
a1.channels.c1.kafka.consumer.group.id = g6
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
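All three agents above launch the same way; only the properties file differs. A sketch of the `flume-ng agent` invocation (the config file names are assumptions for illustration, not from the original configs):

```shell
# Print the launch command for a given agent config file under conf/.
launch() {
    echo "flume-ng agent --name a1 --conf conf --conf-file conf/$1 -Dflume.root.logger=INFO,console"
}

launch kafka-sink.conf     # KafkaSink agent
launch kafka-source.conf   # KafkaSource agent
launch kafka-channel.conf  # KafkaChannel agent
```

`--name a1` must match the agent name used in the properties file (a1 in all three configs above), and the logger override makes the logger sink's output visible on the console.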

This article walked through setting up a Kafka cluster on CentOS, covering software versions, configuration-file changes, and environment variables, and then explored integrating Kafka with Flume, with concrete configurations showing Flume acting as a Kafka producer and consumer to move data efficiently.