Elasticsearch Basics
1. Shipping logs into Elasticsearch with Filebeat
1.1. Test data: apache-daily-access.log
Download link; extraction code: nin8
1.2. Configuration file: filebeat.yml
filebeat.inputs:
- type: log
  enabled: true
  fields:
    apache: true
  tags: ["xiaofan-my-service", "xiaofan-hardware", "xiaofan-test"]
  paths:
    - /home/hadoop/fanjh/data/logstash_data/*.log

output.elasticsearch:
  # either the Elasticsearch endpoint or the Kibana endpoint works here
  hosts: ["https://474bc7dd2b344246875440939836bc31.asia-northeast1.gcp.cloud.es.io:9243"]

# drop the specified fields from every event
processors:
- drop_fields:
    fields: ["ecs"]

cloud.id: "XIAOFAN_ES:YXNpYS1ub3J0aGVhc3QxLmdjcC5jbG91ZC5lcy5pbyQyYWI5MGE2YzU1ZTE0MDRlYWQ4ZTcyNmY4ZThkM2M5YyQ0NzRiYzdkZDJiMzQ0MjQ2ODc1NDQwOTM5ODM2YmMzMQ=="
cloud.auth: "elastic:AGHUf1oLKu7xvBvBlOXJJhoi"
1.3. Test results
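With the configuration above, Filebeat can be validated and started from its installation directory; the steps below are a minimal sketch, and the Kibana Dev Tools query at the end simply confirms that documents arrived.

./filebeat test config      # check filebeat.yml syntax
./filebeat test output      # check connectivity to the Elastic Cloud cluster
./filebeat -e               # run in the foreground, logging to stderr

# in Kibana Dev Tools, confirm that events were indexed:
GET filebeat-*/_count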
2. Importing Apache logs into Elasticsearch with Logstash
2.1. Configuration file: filebeat.yml
filebeat.inputs:
- type: log
  enabled: true
  fields:
    apache: true
  tags: ["xiaofan-my-service", "xiaofan-hardware", "xiaofan-test"]
  paths:
    - /home/hadoop/fanjh/data/logstash_data/*.log

output.logstash:
  # The Logstash hosts
  hosts: ["localhost:5044"]
- Test the output:
./filebeat test output
2.2. Configuration file: apache_logstash.conf (the Logstash pipeline)
input {
  beats {
    port => "5044"
  }
}

filter {
  grok {
    match => {
      "message" => '%{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{DATA:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:response:int} (?:-|%{NUMBER:bytes:int}) %{QS:referrer} %{QS:agent}'
    }
  }

  date {
    match => [ "timestamp", "dd/MMM/YYYY:HH:mm:ss Z" ]
    locale => "en"
  }

  geoip {
    source => "clientip"
  }

  useragent {
    source => "agent"
    target => "useragent"
  }
}

output {
  stdout {
    codec => dots {}
  }

  elasticsearch {
    hosts => [ "https://2ab90a6c55e1404ead8e726f8e8d3c9c.asia-northeast1.gcp.cloud.es.io:9243" ]
    index => "apache_elastic_example"
    user => "elastic"
    password => "AGHUf1oLKu7xvBvBlOXJJhoi"
  }
}
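Before starting the full pipeline, the configuration syntax can be checked on its own; a minimal sketch, assuming the same file path used in 2.3 below:

bin/logstash -f /home/hadoop/fanjh/data/conf/apache_logstash.conf --config.test_and_exit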
2.3. Start Filebeat and Logstash
# start Logstash with the pipeline configuration
./logstash -f /home/hadoop/fanjh/data/conf/apache_logstash.conf
# clear the Filebeat registry (in the Filebeat directory) so the log files are read again from the beginning
rm -rf data/*
./filebeat
2.4. Notes
- grok filter: matches each log line against the regular expression above and assigns the captured values to named fields; in other words, it turns each unstructured log message into a structured event (a worked example follows this list).
- date filter: converts the timestamp field produced by the grok filter into the @timestamp field. By default @timestamp is the time at which the event is processed; here we want it to come from the time recorded in the log line, i.e. the value of timestamp.
- geoip filter: uses the client IP address to resolve where the request came from, including latitude/longitude and other location information.
- useragent filter: adds information about the user agent, such as family, operating system, version, and device.
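To make the effect of these filters concrete, here is an illustrative (made-up) Apache access-log line and, roughly, the fields the grok pattern above would extract from it; the date, geoip, and useragent filters then enrich the same event further.

# sample input line (illustrative only)
83.149.9.216 - - [17/May/2015:10:05:03 +0000] "GET /images/kibana-search.png HTTP/1.1" 200 203023 "http://semicomplete.com/" "Mozilla/5.0"

# approximate fields after the grok filter
clientip:    83.149.9.216
ident:       -
auth:        -
timestamp:   17/May/2015:10:05:03 +0000
verb:        GET
request:     /images/kibana-search.png
httpversion: 1.1
response:    200
bytes:       203023
referrer:    "http://semicomplete.com/"
agent:       "Mozilla/5.0"

# the date filter then sets @timestamp from "timestamp",
# geoip adds geoip.* fields derived from clientip,
# and useragent parses "agent" into the useragent.* fields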
3. Deploying the Elastic Stack with Kafka (one of the most common deployment patterns)
- Logs are unpredictable. After a production incident, right when you need them most, log volume can suddenly spike and overwhelm your logging infrastructure. To protect Logstash and Elasticsearch against such bursts of data, users deploy a buffering mechanism that acts as a message broker, as sketched below.
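A minimal sketch of that pattern: Filebeat publishes events to a Kafka topic, and Logstash consumes from that topic before indexing into Elasticsearch. The broker address localhost:9092 and the topic name apache-logs are assumptions for illustration, not values from the original setup.

# filebeat.yml (excerpt): write events to Kafka instead of directly to Logstash/Elasticsearch
output.kafka:
  hosts: ["localhost:9092"]   # assumed Kafka broker
  topic: "apache-logs"        # assumed topic name
  required_acks: 1
  compression: gzip

# Logstash pipeline (excerpt): read the same topic, then index into Elasticsearch as before
input {
  kafka {
    bootstrap_servers => "localhost:9092"
    topics => ["apache-logs"]
    codec => "json"
  }
}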
4. Analyzing Spring Boot microservice logs with the Elastic Stack
4.1. Create a Spring Boot application [GitHub link]
- Package it as a jar and deploy it to run on Linux, as sketched below.
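A minimal sketch of packaging and running the application so that it writes to the log file Filebeat watches in 4.2; the Maven build and the jar name spring-boot-elastic.jar are assumptions about the project, not taken from the original.

# build the jar (assuming a Maven project)
mvn clean package

# run it in the background, writing its log to the path Filebeat tails
nohup java -jar target/spring-boot-elastic.jar \
  > /home/hadoop/fanjh/data/logstash_data/spring-boot-elastic.log 2>&1 &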
4.2. Configuration files
- filebeat_logstash.yml

filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /home/hadoop/fanjh/data/logstash_data/spring-boot-elastic.log
  multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
  multiline.negate: true
  multiline.match: after
output.logstash:
  hosts: ["localhost:5044"]
- logstash.conf

# Read input from filebeat by listening to port 5044 on which filebeat will send the data
input {
  beats {
    type => "test"
    port => "5044"
  }
}

filter {
  # If a log line contains a tab character followed by 'at', tag that entry as a stacktrace
  if [message] =~ "\tat" {
    grok {
      match => ["message", "^(\tat)"]
      add_tag => ["stacktrace"]
    }
  }
}

output {
  stdout {
    codec => rubydebug
  }

  # Send properly parsed log events to Elasticsearch
  elasticsearch {
    hosts => ["https://ab680dbcf3fa41d8b87e2d1e549bec77.asia-northeast1.gcp.cloud.es.io:9243"]
    user => "elastic"
    password => "cxYiWW4vFEE4nuubo8TZVyrY"
  }
}
4.3. Testing
- Test the Filebeat configuration:
./filebeat -c filebeat_logstash.yml test config
- Test the Filebeat output:
./filebeat -c filebeat_logstash.yml test output
- Start Logstash:
bin/logstash -f config/logstash.conf
- Check the Filebeat output again
- Start Filebeat:
./filebeat -e -c filebeat_logstash.yml
4.4. Shipping multiline logs with Filebeat
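This is handled by the multiline settings already shown in filebeat_logstash.yml above; as a quick recap of how the three options work together (a sketch of the same snippet, not additional configuration):

# lines that do NOT start with a date such as 2020-01-01 ...
multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
multiline.negate: true
# ... are appended after the previous line that did match, so a Java stack
# trace becomes a single event instead of many one-line events
multiline.match: after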
4.5. Tagging
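Tags can be attached either in Filebeat (as with the tags list in section 1.2) or in Logstash (as with add_tag => ["stacktrace"] in section 4.2), and can then be used to filter or route events downstream. The snippet below is an illustrative sketch with assumed tag names, not part of the original configuration:

# filebeat.yml (excerpt): attach tags at the source
tags: ["spring-boot", "xiaofan-test"]

# Logstash pipeline (excerpt): route events based on a tag
output {
  if "stacktrace" in [tags] {
    stdout { codec => rubydebug }   # e.g. print stack-trace events for inspection
  }
}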
5. References
- Beats: Shipping logs into Elasticsearch with Filebeat
- Logstash: Importing Apache logs into Elasticsearch
- Deploying the Elastic Stack with Kafka
- Elastic: Analyzing Spring Boot microservice logs with the Elastic Stack