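Real-time log analysis system (single machine)
This tutorial shows how to use Flume to collect Apache access.log data and forward it through the Kafka message subscription service to a log analysis program. The required environment consists of a Linux system, the Java JDK, ZooKeeper, Flume, Kafka, and a Java program that receives the data.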
I. Preparing the environment
Downloading the software
wget http://www.apache.org/dist/zookeeper/zookeeper-3.4.14/zookeeper-3.4.14.tar.gz
wget http://www.apache.org/dist/flume/1.9.0/apache-flume-1.9.0-bin.tar.gz
wget http://mirror.bit.edu.cn/apache/kafka/2.2.0/kafka_2.11-2.2.0.tgz
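Result (note that this listing shows Flume 1.8.0, the version used by the extraction commands later on, rather than the 1.9.0 archive named in the wget command; adjust the file names to match the version you downloaded):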
-rw-r--r--. 1 root root 58688757 Oct 27 2018 apache-flume-1.8.0-bin.tar.gz
-rw-r--r--. 1 root root 286821827 Apr 4 14:13 flink-1.8.0-bin-scala_2.12.tgz
-rw-r--r--. 1 root root 63999924 Mar 23 08:57 kafka_2.11-2.2.0.tgz
-rw-r--r--. 1 root root 28678231 Mar 4 2016 scala-2.11.8.tgz
-rw-r--r--. 1 root root 37676320 Apr 1 22:44 zookeeper-3.4.14.tar.gz
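Installing the environment
Installing the Java JDK
# Remove any existing OpenJDK 1.8 packages, then install them again
yum -y remove java-1.8.0-openjdk*
yum -y install java-1.8.0-openjdk*
Result of the installation: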
java -version
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (build 1.8.0_212-b04)
OpenJDK 64-Bit Server VM (build 25.212-b04, mixed mode)
Installing ZooKeeper, Flume, and Kafka
Installing and configuring ZooKeeper
cd /usr/local
tar -zxvf zookeeper-3.4.14.tar.gz
mv ./zookeeper-3.4.14 ./zookeeper
vim ./zookeeper/conf/zoo.cfg
Modify the configuration
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/tmp/zookeeper
clientPort=2181
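The ZooKeeper settings above can also be written without opening an editor. The following is a minimal sketch only: the function name write_zoo_cfg and its two arguments are hypothetical and are not part of ZooKeeper.

```shell
#!/bin/sh
# Hypothetical helper: write a standalone zoo.cfg with the five settings listed above.
# $1 = ZooKeeper conf directory, $2 = data directory (both supplied by the caller).
write_zoo_cfg() {
    conf_dir=$1
    data_dir=$2
    # Make sure both directories exist before writing the file.
    mkdir -p "$conf_dir" "$data_dir" || return 1
    cat > "$conf_dir/zoo.cfg" <<EOF
tickTime=2000
initLimit=10
syncLimit=5
dataDir=$data_dir
clientPort=2181
EOF
}
```

With the paths used in this tutorial the call would be: write_zoo_cfg /usr/local/zookeeper/conf /tmp/zookeeper. Because many systems clean /tmp periodically, a different data directory may be preferable for anything longer-lived than a test.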
Installing and configuring Flume
cd /usr/local
tar -zxvf apache-flume-1.8.0-bin.tar.gz
mv ./apache-flume-1.8.0-bin ./flume
vim ./flume/conf/flume-conf.properties
# Configuration changes
agent.sources = s1
agent.channels = c1
agent.sinks = k1
# Read data from the specified file
agent.sources.s1.type = exec
agent.sources.s1.command = tail -f /home/wwwlogs/access.log
agent.sources.s1.channels = c1
# Configure the channel
agent.channels.c1.type = memory
agent.channels.c1.capacity = 10000
agent.channels.c1.transactionCapacity = 100
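# Configure the Kafka sink that receives the data
agent.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
# Broker address: it must be reachable from Flume and should match advertised.listeners in Kafka's server.properties
# brokerList and topic are the older property names; Flume 1.7+ also accepts kafka.bootstrap.servers and kafka.topic
agent.sinks.k1.brokerList = 10.1.1.35:9092
agent.sinks.k1.topic = test
agent.sinks.k1.serializer.class = kafka.serializer.StringEncoder
agent.sinks.k1.channel = c1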
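Every key in the Flume configuration above starts with the prefix "agent", and that prefix is the agent name that has to be passed to flume-ng when the agent is launched. A minimal sketch of the launch command follows, assuming Flume lives in /usr/local/flume as in this tutorial; the FLUME_HOME variable and the function name start_flume_agent are assumptions made for illustration.

```shell
#!/bin/sh
# FLUME_HOME defaults to the install path used in this tutorial (assumption).
FLUME_HOME=${FLUME_HOME:-/usr/local/flume}

# Hypothetical wrapper around the flume-ng launcher.
start_flume_agent() {
    # --name must equal the prefix used in flume-conf.properties ("agent" here);
    # -Dflume.root.logger=INFO,console sends the agent's log to the terminal.
    "$FLUME_HOME/bin/flume-ng" agent \
        --conf "$FLUME_HOME/conf" \
        --conf-file "$FLUME_HOME/conf/flume-conf.properties" \
        --name agent \
        -Dflume.root.logger=INFO,console
}
```

Calling start_flume_agent runs the agent in the foreground, which makes it easy to spot a misspelled source, channel, or sink name in the console output. ZooKeeper and Kafka should already be running at that point, since the sink connects to the broker.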
Installing and configuring Kafka
cd /usr/local
tar -zxvf kafka_2.11-2.2.0.tgz
mv ./kafka_2.11-2.2.0 ./kafka
vim ./kafka/config/server.properties
Modify the configuration
broker.id=1
advertised.listeners=PLAINTEXT://192.168.56.101:9092
zookeeper.connect=127.0.0.1:2181
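Once the broker is running with these settings, the topic named in the Flume sink ("test") can be created and watched with the command-line tools shipped in Kafka's bin directory. The sketch below assumes Kafka is installed in /usr/local/kafka; the KAFKA_HOME variable and both function names are hypothetical.

```shell
#!/bin/sh
# KAFKA_HOME defaults to the install path used in this tutorial (assumption).
KAFKA_HOME=${KAFKA_HOME:-/usr/local/kafka}

# Create the single-partition topic "test" that the Flume sink writes to.
# --zookeeper reuses the zookeeper.connect address and is accepted by the
# Kafka 2.2 tooling; Kafka 3.x removed it in favour of --bootstrap-server.
create_test_topic() {
    "$KAFKA_HOME/bin/kafka-topics.sh" --create \
        --zookeeper 127.0.0.1:2181 \
        --replication-factor 1 --partitions 1 \
        --topic test
}

# Print every message on the topic; $1 is the broker address (host:port).
watch_test_topic() {
    "$KAFKA_HOME/bin/kafka-console-consumer.sh" \
        --bootstrap-server "$1" \
        --topic test --from-beginning
}
```

For example, run create_test_topic once and then watch_test_topic 192.168.56.101:9092 to see whether lines appended to access.log arrive on the topic. The address passed to the consumer should be the one configured in advertised.listeners.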
Starting the programs
Note: the components must be started in order.
- Start ZooKeeper: /usr/local/zookeeper/bin/zkServer.sh start
- Start Kafka