Filebeat collects various types of logs and ships them to Logstash; Logstash then reads the logs and writes them into Kafka (Filebeat variant)
If you are interested in ops courses, you can search for my account 运维实战课程 on Bilibili, AcFun, or CSDN and follow me for more free hands-on ops video tutorials.
0. Environment / machine plan:
192.168.43.163 kafka1
192.168.43.164 kafka2
192.168.43.165 kafka3
192.168.43.166 filebeat-logstash
1. Set up the ZooKeeper + Kafka environment and test writing data into Kafka (all 3 machines run both a ZooKeeper and a Kafka service; the hostnames are simply kafka1/2/3. You can think of it as Kafka using ZooKeeper as its metadata store, relying on ZooKeeper's distributed storage and data consistency.)
3 machines: 192.168.43.163 kafka1, 192.168.43.164 kafka2, 192.168.43.165 kafka3
1) Set up the ZooKeeper + Kafka environment:
On 192.168.43.163:
[root@kafka1 ~]# yum install java-1.8.0-openjdk -y    #install the JDK; ZooKeeper and Kafka run on Java and need a Java runtime
[root@kafka1 ~]# java -version
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (build 1.8.0_212-b04)
OpenJDK 64-Bit Server VM (build 25.212-b04, mixed mode)
Install ZooKeeper:
[root@kafka1 ~]# rz    #upload the ZooKeeper package
[root@kafka1 ~]# ls
zookeeper-3.4.11.tar.gz
[root@kafka1 ~]# tar -zxf zookeeper-3.4.11.tar.gz
[root@kafka1 ~]# ls zookeeper-3.4.11
bin dist-maven lib README_packaging.txt zookeeper-3.4.11.jar.asc
build.xml docs LICENSE.txt recipes zookeeper-3.4.11.jar.md5
conf ivysettings.xml NOTICE.txt src zookeeper-3.4.11.jar.sha1
contrib ivy.xml README.md zookeeper-3.4.11.jar
[root@kafka1 ~]# mv zookeeper-3.4.11 /usr/local/
[root@kafka1 ~]# ln -s /usr/local/zookeeper-3.4.11/ /usr/local/zookeeper
[root@kafka1 ~]# ls /usr/local/zookeeper
bin dist-maven lib README_packaging.txt zookeeper-3.4.11.jar.asc
build.xml docs LICENSE.txt recipes zookeeper-3.4.11.jar.md5
conf ivysettings.xml NOTICE.txt src zookeeper-3.4.11.jar.sha1
contrib ivy.xml README.md zookeeper-3.4.11.jar
[root@kafka1 ~]# mkdir /usr/local/zookeeper/data    #create ZooKeeper's data directory
[root@kafka1 ~]# cp /usr/local/zookeeper/conf/zoo_sample.cfg /usr/local/zookeeper/conf/zoo.cfg
[root@kafka1 ~]# vim /usr/local/zookeeper/conf/zoo.cfg    #configure ZooKeeper; all 3 nodes use the same config and elect a leader automatically
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/usr/local/zookeeper/data
clientPort=2181
#The lines below are added: server.<id>=<server IP>:<leader-follower data sync port>:<leader election port>
server.1=192.168.43.163:2888:3888
server.2=192.168.43.164:2888:3888
server.3=192.168.43.165:2888:3888
#Explanation:
#tickTime: interval of a single heartbeat check between servers, in milliseconds
#initLimit: maximum number of ticks allowed for the initial connection/sync between the leader and a follower; beyond that the follower is considered down
#syncLimit: request/ack timeout between the leader and a follower, in ticks; a follower that does not respond within this time is considered unavailable
#dataDir: directory where ZooKeeper stores its data
#clientPort: port clients use to connect to the ZooKeeper server; ZooKeeper listens on it for client requests
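For example, with the values above (tickTime=2000 ms), initLimit=10 gives a follower 10 x 2000 ms = 20 seconds to finish its initial sync with the leader, and syncLimit=5 means a follower that has not answered within 5 x 2000 ms = 10 seconds is treated as unavailable.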
[root@kafka1 ~]# echo "1" > /usr/local/zookeeper/data/myid    #set this node's ZooKeeper id; the 3 machines must all be different
[root@kafka1 ~]# ls /usr/local/zookeeper/bin/    #several scripts are provided; use whichever you need
README.txt zkCleanup.sh zkCli.cmd zkCli.sh zkEnv.cmd zkEnv.sh zkServer.cmd zkServer.sh
[root@kafka1 ~]# /usr/local/zookeeper/bin/zkServer.sh start    #start ZooKeeper; the leader is elected automatically
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@kafka1 ~]# /usr/local/zookeeper/bin/zkServer.sh status    #check ZooKeeper's status and role
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Mode: follower    #the role is elected automatically
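Once the ZooKeeper service has been started on all three nodes, the whole ensemble can be checked from any one machine. A minimal sketch using ZooKeeper's built-in four-letter commands (assuming nc is installed and the four-letter commands are enabled, which is the default in 3.4):
[root@kafka1 ~]# for ip in 192.168.43.163 192.168.43.164 192.168.43.165; do
>   echo -n "$ip ruok => "; echo ruok | nc $ip 2181; echo    #"imok" means the node is alive
>   echo stat | nc $ip 2181 | grep Mode                      #shows leader/follower for each node
> done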
Install Kafka:
[root@kafka1 ~]# rz    #upload the Kafka package
[root@kafka1 ~]# ls kafka_2.11-1.0.0.tgz
kafka_2.11-1.0.0.tgz
[root@kafka1 ~]# tar -zxf kafka_2.11-1.0.0.tgz
[root@kafka1 ~]# ls
kafka_2.11-1.0.0 kafka_2.11-1.0.0.tgz zookeeper-3.4.11.tar.gz zookeeper.out
[root@kafka1 ~]# mv kafka_2.11-1.0.0 /usr/local/
[root@kafka1 ~]# ln -s /usr/local/kafka_2.11-1.0.0/ /usr/local/kafka
[root@kafka1 ~]# ls /usr/local/kafka
bin config libs LICENSE NOTICE site-docs
[root@kafka1 ~]# vim /usr/local/kafka/config/server.properties    #configure Kafka
broker.id=1    #broker.id: must be different on each of the 3 machines
listeners=PLAINTEXT://192.168.43.163:9092    #listeners: set to this machine's IP, port 9092
advertised.listeners=PLAINTEXT://192.168.43.163:9092    #the address advertised to producers/consumers; needed when several log types are collected, otherwise collecting several types into the same destination can fail
delete.topic.enable=true    #enable topic deletion
log.retention.hours=168
#how long data is kept in Kafka, in hours; 168 hours = 7 days, adjust as needed; older data is deleted automatically
zookeeper.connect=192.168.43.163:2181,192.168.43.164:2181,192.168.43.165:2181
#zookeeper.connect: the ZooKeeper connection string; list all 3 nodes
Start the Kafka service as a daemon:
[root@kafka1 ~]# /usr/local/kafka/bin/kafka-server-start.sh -daemon /usr/local/kafka/config/server.properties
[root@kafka1 ~]# ps -ef |grep kafka
the kafka process is shown
[root@kafka1 ~]# tail /usr/local/kafka/logs/server.log
Check the startup log: the status shows "started", which confirms Kafka is up.
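To confirm this broker has registered itself in ZooKeeper, you can look under /brokers/ids with the ZooKeeper client; a quick sketch (the ids correspond to the broker.id values configured above; you will only see [1] until the other two brokers are started):
[root@kafka1 ~]# /usr/local/zookeeper/bin/zkCli.sh -server 192.168.43.163:2181
[zk: 192.168.43.163:2181(CONNECTED) 0] ls /brokers/ids    #expect [1, 2, 3] once all three brokers are running
[zk: 192.168.43.163:2181(CONNECTED) 1] quit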
On 192.168.43.164:
[root@kafka2 ~]# yum install java-1.8.0-openjdk -y    #install the JDK; ZooKeeper and Kafka run on Java and need a Java runtime
[root@kafka2 ~]# java -version
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (build 1.8.0_212-b04)
OpenJDK 64-Bit Server VM (build 25.212-b04, mixed mode)
Install ZooKeeper:
[root@kafka2 ~]# rz    #upload the ZooKeeper package
[root@kafka2 ~]# ls
zookeeper-3.4.11.tar.gz
[root@kafka2 ~]# tar -zxf zookeeper-3.4.11.tar.gz
[root@kafka2 ~]# ls
zookeeper-3.4.11 zookeeper-3.4.11.tar.gz
[root@kafka2 ~]# ls zookeeper-3.4.11
bin dist-maven lib README_packaging.txt zookeeper-3.4.11.jar.asc
build.xml docs LICENSE.txt recipes zookeeper-3.4.11.jar.md5
conf ivysettings.xml NOTICE.txt src zookeeper-3.4.11.jar.sha1
contrib ivy.xml README.md zookeeper-3.4.11.jar
[root@kafka2 ~]# mv zookeeper-3.4.11 /usr/local/
[root@kafka2 ~]# ln -s /usr/local/zookeeper-3.4.11/ /usr/local/zookeeper
[root@kafka2 ~]# ls /usr/local/zookeeper
bin dist-maven lib README_packaging.txt zookeeper-3.4.11.jar.asc
build.xml docs LICENSE.txt recipes zookeeper-3.4.11.jar.md5
conf ivysettings.xml NOTICE.txt src zookeeper-3.4.11.jar.sha1
contrib ivy.xml README.md zookeeper-3.4.11.jar
[root@kafka2 ~]# mkdir /usr/local/zookeeper/data    #create ZooKeeper's data directory
[root@kafka2 ~]# cp /usr/local/zookeeper/conf/zoo_sample.cfg /usr/local/zookeeper/conf/zoo.cfg
[root@kafka2 ~]# vim /usr/local/zookeeper/conf/zoo.cfg    #configure ZooKeeper; all 3 nodes use the same config and elect a leader automatically
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/usr/local/zookeeper/data
clientPort=2181
#The lines below are added: server.<id>=<server IP>:<leader-follower data sync port>:<leader election port>
server.1=192.168.43.163:2888:3888
server.2=192.168.43.164:2888:3888
server.3=192.168.43.165:2888:3888
#Explanation:
#tickTime: interval of a single heartbeat check between servers, in milliseconds
#initLimit: maximum number of ticks allowed for the initial connection/sync between the leader and a follower; beyond that the follower is considered down
#syncLimit: request/ack timeout between the leader and a follower, in ticks; a follower that does not respond within this time is considered unavailable
#dataDir: directory where ZooKeeper stores its data
#clientPort: port clients use to connect to the ZooKeeper server; ZooKeeper listens on it for client requests
[root@kafka2 ~]# echo "2" > /usr/local/zookeeper/data/myid    #set this node's ZooKeeper id; the 3 machines must all be different
[root@kafka2 ~]# ls /usr/local/zookeeper/bin/    #several scripts are provided; use whichever you need
README.txt zkCleanup.sh zkCli.cmd zkCli.sh zkEnv.cmd zkEnv.sh zkServer.cmd zkServer.sh
[root@kafka2 ~]# /usr/local/zookeeper/bin/zkServer.sh start    #start ZooKeeper; the leader is elected automatically
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@kafka2 ~]# /usr/local/zookeeper/bin/zkServer.sh status    #check ZooKeeper's status and role
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Mode: leader    #the role is elected automatically
Install Kafka:
[root@kafka2 ~]# rz    #upload the Kafka package
[root@kafka2 ~]# ls kafka_2.11-1.0.0.tgz
kafka_2.11-1.0.0.tgz
[root@kafka2 ~]# tar -zxf kafka_2.11-1.0.0.tgz
[root@kafka2 ~]# ls
kafka_2.11-1.0.0 kafka_2.11-1.0.0.tgz zookeeper-3.4.11.tar.gz zookeeper.out
[root@kafka2 ~]# mv kafka_2.11-1.0.0 /usr/local/
[root@kafka2 ~]# ln -s /usr/local/kafka_2.11-1.0.0/ /usr/local/kafka
[root@kafka2 ~]# ls /usr/local/kafka
bin config libs LICENSE NOTICE site-docs
[root@kafka2 ~]# vim /usr/local/kafka/config/server.properties    #configure Kafka
broker.id=2    #broker.id: must be different on each of the 3 machines
listeners=PLAINTEXT://192.168.43.164:9092    #listeners: set to this machine's IP, port 9092
advertised.listeners=PLAINTEXT://192.168.43.164:9092    #the address advertised to producers/consumers; needed when several log types are collected, otherwise collecting several types into the same destination can fail
delete.topic.enable=true    #enable topic deletion
log.retention.hours=168
#how long data is kept in Kafka, in hours; 168 hours = 7 days, adjust as needed; older data is deleted automatically
zookeeper.connect=192.168.43.163:2181,192.168.43.164:2181,192.168.43.165:2181
#zookeeper.connect: the ZooKeeper connection string; list all 3 nodes
Start the Kafka service as a daemon:
[root@kafka2 ~]# /usr/local/kafka/bin/kafka-server-start.sh -daemon /usr/local/kafka/config/server.properties
[root@kafka2 ~]# ps -ef |grep kafka
the kafka process is shown
[root@kafka2 ~]# tail /usr/local/kafka/logs/server.log
Check the startup log: the status shows "started", which confirms Kafka is up.
On 192.168.43.165:
[root@kafka3 ~]# yum install java-1.8.0-openjdk -y    #install the JDK; ZooKeeper and Kafka run on Java and need a Java runtime
[root@kafka3 ~]# java -version
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (build 1.8.0_212-b04)
OpenJDK 64-Bit Server VM (build 25.212-b04, mixed mode)
Install ZooKeeper:
[root@kafka3 ~]# rz    #upload the ZooKeeper package
[root@kafka3 ~]# ls
zookeeper-3.4.11.tar.gz
[root@kafka3 ~]# tar -zxf zookeeper-3.4.11.tar.gz
[root@kafka3 ~]# ls
zookeeper-3.4.11 zookeeper-3.4.11.tar.gz
[root@kafka3 ~]# ls zookeeper-3.4.11
bin dist-maven lib README_packaging.txt zookeeper-3.4.11.jar.asc
build.xml docs LICENSE.txt recipes zookeeper-3.4.11.jar.md5
conf ivysettings.xml NOTICE.txt src zookeeper-3.4.11.jar.sha1
contrib ivy.xml README.md zookeeper-3.4.11.jar
[root@kafka3 ~]# mv zookeeper-3.4.11 /usr/local/
[root@kafka3 ~]# ln -s /usr/local/zookeeper-3.4.11/ /usr/local/zookeeper
[root@kafka3 ~]# ls /usr/local/zookeeper
bin dist-maven lib README_packaging.txt zookeeper-3.4.11.jar.asc
build.xml docs LICENSE.txt recipes zookeeper-3.4.11.jar.md5
conf ivysettings.xml NOTICE.txt src zookeeper-3.4.11.jar.sha1
contrib ivy.xml README.md zookeeper-3.4.11.jar
[root@kafka3 ~]# mkdir /usr/local/zookeeper/data    #create ZooKeeper's data directory
[root@kafka3 ~]# cp /usr/local/zookeeper/conf/zoo_sample.cfg /usr/local/zookeeper/conf/zoo.cfg
[root@kafka3 ~]# vim /usr/local/zookeeper/conf/zoo.cfg    #configure ZooKeeper; all 3 nodes use the same config and elect a leader automatically
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/usr/local/zookeeper/data
clientPort=2181
#The lines below are added: server.<id>=<server IP>:<leader-follower data sync port>:<leader election port>
server.1=192.168.43.163:2888:3888
server.2=192.168.43.164:2888:3888
server.3=192.168.43.165:2888:3888
#Explanation:
#tickTime: interval of a single heartbeat check between servers, in milliseconds
#initLimit: maximum number of ticks allowed for the initial connection/sync between the leader and a follower; beyond that the follower is considered down
#syncLimit: request/ack timeout between the leader and a follower, in ticks; a follower that does not respond within this time is considered unavailable
#dataDir: directory where ZooKeeper stores its data
#clientPort: port clients use to connect to the ZooKeeper server; ZooKeeper listens on it for client requests
[root@kafka3 ~]# echo "3" > /usr/local/zookeeper/data/myid    #set this node's ZooKeeper id; the 3 machines must all be different
[root@kafka3 ~]# ls /usr/local/zookeeper/bin/    #several scripts are provided; use whichever you need
README.txt zkCleanup.sh zkCli.cmd zkCli.sh zkEnv.cmd zkEnv.sh zkServer.cmd zkServer.sh
[root@kafka3 ~]# /usr/local/zookeeper/bin/zkServer.sh start    #start ZooKeeper; the leader is elected automatically
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@kafka3 ~]# /usr/local/zookeeper/bin/zkServer.sh status    #check ZooKeeper's status and role
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Mode: follower    #the role is elected automatically
Install Kafka:
[root@kafka3 ~]# rz    #upload the Kafka package
[root@kafka3 ~]# ls kafka_2.11-1.0.0.tgz
kafka_2.11-1.0.0.tgz
[root@kafka3 ~]# tar -zxf kafka_2.11-1.0.0.tgz
[root@kafka3 ~]# ls
kafka_2.11-1.0.0 kafka_2.11-1.0.0.tgz zookeeper-3.4.11.tar.gz zookeeper.out
[root@kafka3 ~]# mv kafka_2.11-1.0.0 /usr/local/
[root@kafka3 ~]# ln -s /usr/local/kafka_2.11-1.0.0/ /usr/local/kafka
[root@kafka3 ~]# ls /usr/local/kafka
bin config libs LICENSE NOTICE site-docs
[root@kafka3 ~]# vim /usr/local/kafka/config/server.properties    #configure Kafka
broker.id=3    #broker.id: must be different on each of the 3 machines
listeners=PLAINTEXT://192.168.43.165:9092    #listeners: set to this machine's IP, port 9092
advertised.listeners=PLAINTEXT://192.168.43.165:9092    #the address advertised to producers/consumers; needed when several log types are collected, otherwise collecting several types into the same destination can fail
delete.topic.enable=true    #enable topic deletion
log.retention.hours=168
#how long data is kept in Kafka, in hours; 168 hours = 7 days, adjust as needed; older data is deleted automatically
zookeeper.connect=192.168.43.163:2181,192.168.43.164:2181,192.168.43.165:2181
#zookeeper.connect: the ZooKeeper connection string; list all 3 nodes
Start the Kafka service as a daemon:
[root@kafka3 ~]# /usr/local/kafka/bin/kafka-server-start.sh -daemon /usr/local/kafka/config/server.properties
[root@kafka3 ~]# ps -ef |grep kafka
the kafka process is shown
[root@kafka3 ~]# tail /usr/local/kafka/logs/server.log
Check the startup log: the status shows "started", which confirms Kafka is up.
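Optional: the commands above start ZooKeeper and Kafka by hand, so they will not come back after a reboot. A minimal sketch of systemd units for the layout used here (paths are the ones installed above; JVM tuning and the like are omitted; do the same on kafka2 and kafka3):
[root@kafka1 ~]# vim /etc/systemd/system/zookeeper.service
[Unit]
Description=ZooKeeper
After=network.target

[Service]
Type=forking
ExecStart=/usr/local/zookeeper/bin/zkServer.sh start
ExecStop=/usr/local/zookeeper/bin/zkServer.sh stop
Restart=on-failure

[Install]
WantedBy=multi-user.target
[root@kafka1 ~]# vim /etc/systemd/system/kafka.service
[Unit]
Description=Kafka
After=network.target zookeeper.service

[Service]
Type=simple
ExecStart=/usr/local/kafka/bin/kafka-server-start.sh /usr/local/kafka/config/server.properties
ExecStop=/usr/local/kafka/bin/kafka-server-stop.sh
Restart=on-failure

[Install]
WantedBy=multi-user.target
[root@kafka1 ~]# systemctl daemon-reload && systemctl enable zookeeper kafka    #enable both services at boot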
2) A simple test of storing data in Kafka, which stores data per topic (here we only create topics, without producing messages into them):
For example: create two topics, logstashtest and test, each with partitions=3 and replication-factor=3.
The test can be run on any of the kafka machines (the topic metadata is kept in ZooKeeper, so a topic created on any node is visible on all 3).
For example: create the topic on kafka1 and query it from kafka2.
On 192.168.43.163 (create the topic logstashtest on kafka1; any node can create it and all 3 stay in sync):
[root@kafka1 ~]# /usr/local/kafka/bin/kafka-topics.sh --create --zookeeper \
> 192.168.43.163:2181,192.168.43.164:2181,192.168.43.165:2181 --partitions 3 --replication-factor 3 \
> --topic logstashtest    #press Enter
Created topic "logstashtest".    #it must report "Created"
[root@kafka1 ~]# /usr/local/kafka/bin/kafka-topics.sh --create --zookeeper \
> 192.168.43.163:2181,192.168.43.164:2181,192.168.43.165:2181 --partitions 3 --replication-factor 3 \
> --topic test    #press Enter
Created topic "test".    #it must report "Created"
On 192.168.43.164 (query the topics from kafka2; any node can query them):
Describe a specific topic: logstashtest and test are described separately here
[root@kafka2 ~]# /usr/local/kafka/bin/kafka-topics.sh --describe --zookeeper \
> 192.168.43.163:2181,192.168.43.164:2181,192.168.43.165:2181 --topic logstashtest    #press Enter
Topic:logstashtest PartitionCount:3 ReplicationFactor:3 Configs:
Topic: logstashtest Partition: 0 Leader: 3 Replicas: 3,1,2 Isr: 3,1,2
Topic: logstashtest Partition: 1 Leader: 1 Replicas: 1,2,3 Isr: 1,2,3
Topic: logstashtest Partition: 2 Leader: 2 Replicas: 2,3,1 Isr: 2,3,1
List all topics at once:
[root@kafka2 ~]# /usr/local/kafka/bin/kafka-topics.sh --list --zookeeper \
> 192.168.43.163:2181,192.168.43.164:2181,192.168.43.165:2181    #press Enter
logstashtest
test
On 192.168.43.163 (delete the topic logstashtest from kafka1; any node can delete it):
[root@kafka1 ~]# /usr/local/kafka/bin/kafka-topics.sh --delete --zookeeper \
> 192.168.43.163:2181,192.168.43.164:2181,192.168.43.165:2181 --topic logstashtest    #press Enter
On 192.168.43.164 (query the topics from kafka2 again; any node can query them):
[root@kafka2 ~]# /usr/local/kafka/bin/kafka-topics.sh --describe --zookeeper \
> 192.168.43.163:2181,192.168.43.164:2181,192.168.43.165:2181 --topic logstashtest    #press Enter
No output, because the topic has been deleted
Command to list all topics:
[root@kafka2 ~]# /usr/local/kafka/bin/kafka-topics.sh --list --zookeeper \
> 192.168.43.163:2181,192.168.43.164:2181,192.168.43.165:2181
test    #only the test topic is left
3) A slightly fuller test of storing data in Kafka: create a topic and produce messages into it
To show the effect: create a topic and produce (send) messages into it on one machine (manually writing data into Kafka), then view the messages on another machine (reading the data back out of Kafka).
On 192.168.43.163 (create the topic messagetest on kafka1; any node can create it and all 3 stay in sync):
Create the topic messagetest (creating it on any one machine is enough):
[root@kafka1 ~]# /usr/local/kafka/bin/kafka-topics.sh --create --zookeeper \
> 192.168.43.163:2181,192.168.43.164:2181,192.168.43.165:2181 --partitions 3 --replication-factor 3 --topic messagetest    #press Enter
Created topic "messagetest".    #it must report "Created"
Produce messages into the topic just created (the data is written into the Kafka cluster; any one machine can produce). Producing always goes through port 9092:
[root@kafka1 ~]# /usr/local/kafka/bin/kafka-console-producer.sh --broker-list \
> 192.168.43.163:9092,192.168.43.164:9092,192.168.43.165:9092 --topic messagetest    #press Enter
>hello1
>hello2
>hello3
>hello4
On 192.168.43.164: describe the topic and receive the messages (both the describe command and this old consumer go through port 2181).
Describe the topic:
[root@kafka2 ~]# /usr/local/kafka/bin/kafka-topics.sh --describe --zookeeper \
> 192.168.43.163:2181,192.168.43.164:2181,192.168.43.165:2181 --topic messagetest
Topic:messagetest PartitionCount:3 ReplicationFactor:3 Configs:
Topic: messagetest Partition: 0 Leader: 1 Replicas: 1,2,3 Isr: 1,2,3
Topic: messagetest Partition: 1 Leader: 2 Replicas: 2,3,1 Isr: 2,3,1
Topic: messagetest Partition: 2 Leader: 3 Replicas: 3,1,2 Isr: 3,1,2
Receive the messages (consume them):
[root@kafka2 ~]# /usr/local/kafka/bin/kafka-console-consumer.sh --zookeeper \
> 192.168.43.163:2181,192.168.43.164:2181,192.168.43.165:2181 --topic messagetest \
> --from-beginning    #press Enter; --from-beginning consumes from the earliest message
hello3
hello1
hello4
hello2
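Note that the messages do not come back in send order: the topic has 3 partitions and ordering is only guaranteed within a single partition. Also, the --zookeeper flag uses the old consumer; on Kafka 1.0 you can consume directly through the brokers with the new consumer instead, a quick sketch:
[root@kafka2 ~]# /usr/local/kafka/bin/kafka-console-consumer.sh --bootstrap-server \
> 192.168.43.163:9092,192.168.43.164:9092,192.168.43.165:9092 --topic messagetest --from-beginning
prints the same hello1..hello4 messages (again, order across partitions is not guaranteed)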
2. Install logstash and filebeat. Filebeat collects this machine's various log types and writes them to the local logstash; logstash then forwards the data to the Kafka cluster. Machine: filebeat-logstash, 192.168.43.166
1) Install logstash 6.2.4
[root@filebeat-logstash ~]# yum install java-1.8.0-openjdk -y    #install the JDK; logstash needs a Java runtime
[root@filebeat-logstash ~]# java -version
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (build 1.8.0_212-b04)
OpenJDK 64-Bit Server VM (build 25.212-b04, mixed mode)
[root@filebeat-logstash ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.43.163 kafka1
192.168.43.164 kafka2
192.168.43.165 kafka3
192.168.43.166 filebeat-logstash
[root@filebeat-logstash ~]# rz    #upload the logstash package
[root@filebeat-logstash ~]# ls
logstash-6.2.4.tar.gz
[root@filebeat-logstash ~]# tar -zxf logstash-6.2.4.tar.gz
[root@filebeat-logstash ~]# mv logstash-6.2.4 /usr/local/
[root@filebeat-logstash ~]# ls /usr/local/logstash-6.2.4/
bin CONTRIBUTORS Gemfile lib logstash-core modules tools
config data Gemfile.lock LICENSE logstash-core-plugin-api NOTICE.TXT vendor
[root@filebeat-logstash ~]# export PATH=$PATH:/usr/local/logstash-6.2.4/bin/
[root@filebeat-logstash ~]# echo "PATH=$PATH:/usr/local/logstash-6.2.4/bin/" >> /etc/profile
[root@filebeat-logstash ~]# source /etc/profile
[root@filebeat-logstash ~]# logstash -V
logstash 6.2.4
[root@filebeat-logstash ~]# cat beats.conf    #logstash will listen for logs written by filebeat and then output them to kafka
input {
beats {
port => "5044"
codec => "json"
}
}
output {
if [type] == "system-log-166-filebeat" {
kafka {
bootstrap_servers => "192.168.43.163:9092,192.168.43.164:9092,192.168.43.165:9092"
topic_id => "system-log-filebeat-166"
codec => "json"
}}
if [type] == "nginx-accesslog-166-filebeat" {
kafka {
bootstrap_servers => "192.168.43.163:9092,192.168.43.164:9092,192.168.43.165:9092"
topic_id => "nginx-accesslog-filebeat-166"
codec => "json"
}}
if [type] == "es-log-166-filebeat" {
kafka {
bootstrap_servers => "192.168.43.163:9092,192.168.43.164:9092,192.168.43.165:9092"
topic_id => "es-log-filebeat-166"
codec => "json"
}}
}
#Explanation:
port => "5044"    #receive data on this machine's port 5044, logstash's default beats port. Any other logstash machine would be configured the same way: filebeat already specifies the logstash address when it ships, so logstash only needs to listen on its local default port 5044.
A type and a topic are two independent concepts; do not confuse them.
Everything filebeat ships to logstash arrives on the same input, so the different log types can only be told apart inside logstash, by checking the type field.
Start logstash; it listens on port 5044, waits for filebeat to write to it, and then writes the received log data into kafka:
[root@filebeat-logstash ~]# logstash -f beats.conf
[root@filebeat-logstash ~]# netstat -anput |grep 5044
tcp6 0 0 :::5044 :::* LISTEN 17766/java
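Running logstash in the foreground like this ties up the terminal. A small sketch for validating the config first and then keeping logstash running in the background (the -t flag only tests the config file and exits; nohup keeps the process alive after logout):
[root@filebeat-logstash ~]# logstash -t -f beats.conf    #config test; it should report that the configuration is OK
[root@filebeat-logstash ~]# nohup logstash -f beats.conf > /tmp/logstash-beats.log 2>&1 &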
2) Install nginx and prepare to collect its access log (the nginx log is best written in JSON format):
a) Install nginx:
[root@filebeat-logstash ~]# yum -y install gcc gcc-c++
[root@filebeat-logstash ~]# yum -y install openssl-devel openssl zlib zlib-devel pcre pcre-devel
[root@filebeat-logstash ~]# rz    #upload the nginx package
[root@filebeat-logstash ~]# ls nginx-1.6.2.tar.gz
nginx-1.6.2.tar.gz
[root@filebeat-logstash ~]# tar -zxf nginx-1.6.2.tar.gz
[root@filebeat-logstash ~]# cd nginx-1.6.2
[root@filebeat-logstash nginx-1.6.2]# ls
auto CHANGES CHANGES.ru conf configure contrib html LICENSE man README src
[root@filebeat-logstash nginx-1.6.2]# useradd -s /sbin/nologin -M nginx
[root@filebeat-logstash nginx-1.6.2]# ./configure --user=nginx --group=nginx --prefix=/usr/local/nginx --with-http_stub_status_module --with-http_ssl_module
[root@filebeat-logstash nginx-1.6.2]# make && make install
[root@filebeat-logstash nginx-1.6.2]# ls /usr/local/nginx/
conf html logs sbin
b) Configure nginx to log in JSON (key-value) format, which is much easier for downstream tools such as Kibana to work with:
[root@filebeat-logstash nginx-1.6.2]# vim /usr/local/nginx/conf/nginx.conf
#add the following inside the http {} block:
log_format json '{"@timestamp":"$time_iso8601",'
'"@version":"1",'
'"client":"$remote_addr",'
'"url":"$uri",'
'"status":"$status",'
'"domain":"$host",'
'"host":"$server_addr",'
'"size":$body_bytes_sent,'
'"responsetime":$request_time,'
'"referer":"$http_referer",'
'"ua":"$http_user_agent"'
'}';
access_log logs/access_json.log json;
wq
[root@filebeat-logstash nginx-1.6.2]# /usr/local/nginx/sbin/nginx -t
nginx: the configuration file /usr/local/nginx/conf/nginx.conf syntax is ok
nginx: configuration file /usr/local/nginx/conf/nginx.conf test is successful
[root@filebeat-logstash nginx-1.6.2]# /usr/local/nginx/sbin/nginx
[root@filebeat-logstash nginx-1.6.2]# netstat -anput |grep 80
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 5164/nginx: master
[root@filebeat-logstash nginx-1.6.2]# curl localhost    #access nginx to generate log entries; run it twice
the default nginx welcome page is returned
[root@filebeat-logstash nginx-1.6.2]# ls /usr/local/nginx/logs/
access_json.log error.log nginx.pid
[root@filebeat-logstash nginx-1.6.2]# cat /usr/local/nginx/logs/access_json.log    #JSON-formatted log; these entries will be collected and written into kafka
{"@timestamp":"2019-04-25T23:35:30+08:00","@version":"1","client":"127.0.0.1","url":"/index.html","status":"200","domain":"localhost","host":"127.0.0.1","size":612,"responsetime":0.000,"referer":"-","ua":"curl/7.29.0"}
{"@timestamp":"2019-04-25T23:35:32+08:00","@version":"1","client":"127.0.0.1","url":"/index.html","status":"200","domain":"localhost","host":"127.0.0.1","size":612,"responsetime":0.000,"referer":"-","ua":"curl/7.29.0"}
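It is worth double-checking that each line really is valid JSON, because filebeat/logstash will parse it with the json codec and a stray comma breaks every event. A quick check, assuming the stock python on CentOS is available:
[root@filebeat-logstash nginx-1.6.2]# tail -n 1 /usr/local/nginx/logs/access_json.log | python -m json.tool    #pretty-prints the record if it is valid JSON, errors out otherwise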
[root@filebeat-logstash nginx-1.6.2]# cd
3) Install filebeat
[root@filebeat-logstash ~]# rz    #upload the filebeat package
[root@filebeat-logstash ~]# ls
filebeat-5.6.8-x86_64.rpm logstash-6.2.4.tar.gz nginx-1.6.2 nginx-1.6.2.tar.gz
[root@filebeat-logstash ~]# yum -y install filebeat-5.6.8-x86_64.rpm
[root@filebeat-logstash ~]# vim /etc/filebeat/filebeat.yml    #edit the filebeat config file
#the section below is modified: the system log type
- input_type: log
  paths:
    - /var/log/*.log
    - /var/log/messages    #added
  exclude_lines: ["^DBG"]    #uncommented; lines to exclude
  exclude_files: [".gz$"]    #uncommented; files to exclude
  document_type: "system-log-166-filebeat"    #added; defines the type
#below is a second prospector, added to collect nginx's JSON access log (the nginx JSON type)
- input_type: log
  paths:
    - /usr/local/nginx/logs/access_json.log
  exclude_lines: ["^DBG"]    #lines to exclude
  exclude_files: [".gz$"]    #files to exclude
  document_type: "nginx-accesslog-166-filebeat"    #defines the type
#below is a third prospector, added to collect a Java-style log (an excerpt of an elasticsearch log; the java type)
- input_type: log
  paths:
    - /opt/es-test.log.2019-04-19
  exclude_lines: ["^DBG"]    #lines to exclude
  exclude_files: [".gz$"]    #files to exclude
  document_type: "es-log-166-filebeat"    #defines the type
#comment out the elasticsearch output below; it is the default output
#output.elasticsearch:
#  hosts: ["localhost:9200"]
#the section below also writes the collected logs to a local file; it is optional and only kept here to see the effect
output.file:
  path: "/tmp"
  filename: "filebeat.txt"
#the section below ships the collected logs to logstash (all types go to logstash; the type check happens inside logstash), here the local one. Start logstash first, then filebeat.
output.logstash:
  hosts: ["192.168.43.166:5044"]    #logstash server address(es); can be more than one, here the local logstash
  enabled: true    #whether the logstash output is enabled; true is the default
  worker: 1    #number of worker threads
  compression_level: 3    #compression level
  #loadbalance: true    #enable load balancing across multiple outputs; turn this on if there are several logstash servers
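Before starting the service it is worth validating the YAML; a sketch (on filebeat 5.x the -configtest flag should work, while 6.x replaces it with "filebeat test config"):
[root@filebeat-logstash ~]# filebeat -configtest -c /etc/filebeat/filebeat.yml    #prints nothing or an OK-style message when the file parses, otherwise it reports the offending line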
[root@filebeat-logstash ~]# systemctl start filebeat    #start filebeat; it collects the different log types and writes them to logstash
[root@filebeat-logstash ~]# tailf /var/log/filebeat/filebeat    #watch filebeat's own log to confirm it started cleanly
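Two quick sanity checks that events are actually flowing (a sketch; the /tmp file only exists because the optional output.file section above is enabled):
[root@filebeat-logstash ~]# netstat -anput | grep 5044    #besides the LISTEN line there should now be an ESTABLISHED connection from filebeat to logstash
[root@filebeat-logstash ~]# tail -n 2 /tmp/filebeat.txt    #the local copy of the collected events written by output.file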
3. On the kafka cluster, verify that the data has been written into kafka:
List all topics:
[root@kafka2 ~]# /usr/local/kafka/bin/kafka-topics.sh --list --zookeeper \
> 192.168.43.163:2181,192.168.43.164:2181,192.168.43.165:2181    #press Enter
es-log-filebeat-166
nginx-accesslog-filebeat-166
system-log-filebeat-166
Describe a specific topic:
[root@kafka2 ~]# /usr/local/kafka/bin/kafka-topics.sh --describe --zookeeper \
> 192.168.43.163:2181,192.168.43.164:2181,192.168.43.165:2181 --topic es-log-filebeat-166    #press Enter
Topic:es-log-filebeat-166 PartitionCount:1 ReplicationFactor:1 Configs:
Topic: es-log-filebeat-166 Partition: 0 Leader: 2 Replicas: 2 Isr: 2
[root@kafka2 ~]# /usr/local/kafka/bin/kafka-topics.sh --describe --zookeeper \
> 192.168.43.163:2181,192.168.43.164:2181,192.168.43.165:2181 --topic nginx-accesslog-filebeat-166    #press Enter
Topic:nginx-accesslog-filebeat-166 PartitionCount:1 ReplicationFactor:1 Configs:
Topic: nginx-accesslog-filebeat-166 Partition: 0 Leader: 1 Replicas: 1 Isr: 1
[root@kafka2 ~]# /usr/local/kafka/bin/kafka-topics.sh --describe --zookeeper \
> 192.168.43.163:2181,192.168.43.164:2181,192.168.43.165:2181 --topic system-log-filebeat-166    #press Enter
Topic:system-log-filebeat-166 PartitionCount:1 ReplicationFactor:1 Configs:
Topic: system-log-filebeat-166 Partition: 0 Leader: 2 Replicas: 2 Isr: 2
View the log data in a specific topic (consume the messages as a consumer):
[root@kafka2 ~]# /usr/local/kafka/bin/kafka-console-consumer.sh --zookeeper \
> 192.168.43.163:2181,192.168.43.164:2181,192.168.43.165:2181 --topic es-log-filebeat-166 --from-beginning    #press Enter
The es logs that were written are shown; there is too much output to reproduce here.
[root@kafka2 ~]# /usr/local/kafka/bin/kafka-console-consumer.sh --zookeeper \
> 192.168.43.163:2181,192.168.43.164:2181,192.168.43.165:2181 --topic nginx-accesslog-filebeat-166 --from-beginning    #press Enter
The nginx access logs that were written are shown; there is too much output to reproduce here.
[root@kafka2 ~]# /usr/local/kafka/bin/kafka-console-consumer.sh --zookeeper \
> 192.168.43.163:2181,192.168.43.164:2181,192.168.43.165:2181 --topic system-log-filebeat-166 --from-beginning    #press Enter
The system logs that were written are shown; there is too much output to reproduce here.
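If the consumer output is too long to eyeball, you can instead check how many messages each topic has accumulated. A sketch using the GetOffsetShell tool that ships with Kafka (--time -1 asks for the latest offset; the per-partition numbers add up to the message count):
[root@kafka2 ~]# /usr/local/kafka/bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list \
> 192.168.43.163:9092,192.168.43.164:9092,192.168.43.165:9092 --topic system-log-filebeat-166 --time -1
prints one "topic:partition:offset" line per partition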