Log routing here is quite flexible; the combinations tested so far include:
filebeat -> logstash
filebeat -> elastic
filebeat -> redis
logstash <-> redis
logstash -> elastic
logstash -> redis -> logstash -> elastic
The early environment spanned different networks, so redis sat in the middle as a relay.
Collector-side logstash configuration
The collector side does no processing and pushes straight into redis; grok/regex matching is done on the backend.
input {
file {
path => "/var/log/nginx/prod_access.log"
start_position => "beginning"
codec => plain {
charset => "UTF-8"
}
type => "prod_nginx"
}
}
output {
redis {
host => "ip_address" ## actual address
port => 37000
data_type => "list"
key => "prod_nginx"
}
}
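With the shipper running, it is easy to confirm that events are actually reaching the relay by checking the list length directly (assuming redis-cli is available and the host/port above):

redis-cli -h ip_address -p 37000 LLEN prod_nginx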
Backend logstash: reading from redis
input {
file {
path => "/data/path/to/log/prod_access.log"
start_position => "beginning"
codec => plain {
charset => "UTF-8"
}
type => "nginx"
}
}
filter {
grok {
patterns_dir => "/usr/local/logstash-5.1.1/nginx_pattern"
match => {
"message" => ["%{NGINXACCESS1}", "%{NGINXACCESS2}"]
}
overwrite => ["message"]
}
date {
match => [ "timestamp", "dd/MMM/YYYY:HH:mm:ss Z" ]
}
}
output {
elasticsearch {
hosts => "192.168.100.41:9200"
manage_template => false
# index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}" # this setting was a bit problematic here
index => "nginx" ## everything lands in a single index, which is not ideal
document_type => "%{[@metadata][type]}"
}
}
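Before starting the backend instance, the pipeline file can be syntax-checked first (the install path matches the patterns_dir above; the config filename is assumed):

/usr/local/logstash-5.1.1/bin/logstash -f backend_nginx.conf -t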
Grok patterns used by logstash to parse the nginx logs
CACHE_STAT \w+|-
RE_TIME %{NUMBER}|-
the_URI %{URI}|-
NGINXACCESS1 %{IP:client} - - \[%{HTTPDATE:localtime}\] \"%{WORD:method} %{URIPATHPARAM:uri_parm} HTTP/%{NUMBER:ver}\" %{NUMBER:status:int} %{NUMBER:body_bytes_sent:int} \"%{the_URI:referer}\" %{NUMBER:bytes_sent:int} %{NUMBER:request_length:int} \"%{GREEDYDATA:agent}\" \"-\" \"%{CACHE_STAT:cache_status}\" %{RE_TIME:request_time} %{RE_TIME:up_response_time}
NGINXACCESS2 %{IP:client} - - \[%{HTTPDATE:localtime}\] \"%{WORD:method} %{URIPATHPARAM:uri_parm} HTTP/%{NUMBER:ver}\" %{NUMBER:status:int} %{NUMBER:body_bytes_sent:int} \"%{URI:referer}\" %{NUMBER:bytes_sent:int} %{NUMBER:request_length:int} \"%{GREEDYDATA:agent}\" \"-\" \"%{CACHE_STAT:cache_status}\" %{RE_TIME:request_time} %{RE_TIME:up_response_time}
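As a sanity check, a hypothetical access line of the following shape matches NGINXACCESS1 (every value here is made up for illustration):

192.168.1.10 - - [18/Mar/2017:10:23:45 +0800] "GET /shop/item?id=42 HTTP/1.1" 200 5326 "https://www.example.com/shop" 5580 356 "Mozilla/5.0" "-" "HIT" 0.012 0.010

The "-" alternatives in CACHE_STAT, RE_TIME and the_URI are what let NGINXACCESS1 also match requests that lack a referer, cache status or upstream timing.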
filebeat -> redis -> logstash -> elastic
The collector was switched to the lighter-weight filebeat (no noticeable difference so far), which reads nginx logs that are already written in JSON, so the backend no longer has to do any regex matching.
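The nginx side of this was not shown in the original notes; a minimal log_format along these lines would produce such a JSON file (the field names localtime and clientip are chosen to match the logstash filter further down; the rest are assumptions):

log_format json_access '{"localtime":"$time_local",'
                       '"clientip":"$remote_addr",'
                       '"request":"$request",'
                       '"status":"$status",'
                       '"body_bytes_sent":"$body_bytes_sent",'
                       '"referer":"$http_referer",'
                       '"agent":"$http_user_agent"}';
access_log /var/log/nginx/prod_access.json json_access;

Note that nginx (before the escape=json option arrived in 1.11.8) escapes unsafe characters as \xHH, which is not valid JSON; the mutate/gsub in the logstash filter below exists to work around exactly that.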
filebeat
filebeat.prospectors:
- input_type: log
paths:
- /var/log/nginx/prod_access.json
document_type: "access_log"
fields:
log_source: "prod_nginx"
output.redis:
hosts: "redis_ip_address"
port: 37000
key: "%{[fields.log_source]}" #使用前面自定义的变量作为key
logstash
Two logstash output paths are used here: one reads from redis and writes the raw logs back to local files; the other loads them into elasticsearch for searching.
input {
redis {
host => "192.168.*.*"
port => 7000
key => "prod_nginx"
data_type => "list"
}
redis {
host => "192.168.*.*"
port => 7000
key => "shop_tomcat_254"
type => "shop_tomcat"
codec => "json"
data_type => "list"
}
redis {
host => "192.168.*.*"
port => 7000
key => "shop_tomcat_151"
type => "shop_tomcat"
codec => "json"
data_type => "list"
}
redis {
host => "192.168.*.*"
port => 7000
key => "p2p_tomcat_254"
type => "p2p_tomcat"
codec => "json"
data_type => "list"
}
redis {
host => "192.168.*.*"
port => 7000
key => "p2p_tomcat_151"
type => "p2p_tomcat"
codec => "json"
data_type => "list"
}
redis {
host => "192.168.*.*"
port => 7000
key => "p2p_info_254"
type => "p2p_info"
codec => "json"
data_type => "list"
}
redis {
host => "192.168.*.*"
port => 7000
key => "p2p_info_151"
type => "p2p_info"
codec => "json"
data_type => "list"
}
redis {
host => "192.168.*.*"
port => 7000
key => "wx_tomcat_254"
type => "wx_tomcat"
codec => "json"
data_type => "list"
}
}
filter {
if [type] == "access_log" {
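# nginx's default escaping writes unsafe bytes as \xHH (e.g. \x22),
# which is not legal JSON; the gsub below doubles the backslash so
# the json filter can parse such fields as a literal "\x" sequence
# instead of failing on the whole event.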
mutate {
gsub => ["message", "\\x", "\\\x"]
remove_field => ["beat"]
}
json {
source => "message"
}
date {
locale => "en"
match => ["localtime","dd/MMM/YYYY:HH:mm:ss Z"]
}
geoip {
source => "clientip"
}
}
}
output {
if [type] == "access_log" {
elasticsearch {
hosts => "elk.dev:9200"
index => "prod_nginx_%{+YYYYMMdd}"
## one separate index per day
}
file {
path => "/data/applogs/nginx_prod/prod_access.log"
codec => line { format => "%{message}"}
}
}
if [fields][log_source] == "shop_tomcat_254" {
file {
path => "/data/applogs/shop_prod/254_catalina.out"
codec => line { format => "%{message}"}
}
}
if [fields][log_source] == "shop_tomcat_151" {
file {
path => "/data/applogs/shop_prod/151_catalina.out"
codec => line { format => "%{message}"}
}
}
if [fields][log_source] == "p2p_tomcat_254" {
file {
path => "/data/applogs/p2p_prod/254_catalina.out"
codec => line { format => "%{message}"}
}
}
if [fields][log_source] == "p2p_tomcat_151" {
file {
path => "/data/applogs/p2p_prod/151_catalina.out"
codec => line { format => "%{message}"}
}
}
if [fields][log_source] == "p2p_info_254" {
file {
path => "/data/applogs/pm_prod/254_catalina.out"
codec => line { format => "%{message}"}
}
}
if [fields][log_source] == "p2p_info_151" {
file {
path => "/data/applogs/pm_prod/151_catalina.out"
codec => line { format => "%{message}"}
}
}
if [fields][log_source] == "wx_tomcat_254" {
file {
path => "/data/applogs/weixi/catalina.out"
codec => line { format => "%{message}"}
}
}
}
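Once data is flowing, the daily indices can be inspected with the _cat API (host as configured above):

curl 'elk.dev:9200/_cat/indices/prod_nginx_*?v'

Daily indices also make retention trivial: expiring an old day is a single index deletion rather than a delete-by-query inside one big index, which is exactly what the single "nginx" index in the first flow lacked.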
This post covered two log-processing pipelines built from filebeat, logstash and redis. In the first, logstash ships through redis into elastic; in the second, filebeat writes to redis, and logstash reads from there and loads elastic. In the first flow the collector-side logstash only collects, with grok matching done by the backend logstash; in the second, the lighter-weight filebeat reads nginx logs already written as JSON, so the backend needs no regex parsing at all.