
Logstash Installation and Configuration: From Basics to Advanced Practice

Installation

sudo rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch

vi /etc/yum.repos.d/logstash.repo

[logstash-8.x]
name=Elastic repository for 8.x packages
baseurl=https://artifacts.elastic.co/packages/8.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md

 sudo yum install logstash
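
To quickly confirm that the package is installed, print the version (the bin path matches the layout listed below):

/usr/share/logstash/bin/logstash --version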

Installation layout

logstash.service:   /etc/systemd/system/logstash.service

home:       /usr/share/logstash

bin:        /usr/share/logstash/bin

config:     /etc/logstash

log:        /var/log/logstash/

plugins:    /usr/share/logstash/plugins

data:       /var/lib/logstash (contains the .lock file, etc.)

Configuration

Pipeline configuration files, which define the Logstash processing pipeline:

/etc/logstash/conf.d

Settings files, which specify options that control Logstash startup and execution:

/etc/logstash/logstash.yml        The main settings file.

/etc/logstash/pipelines.yml       Contains the framework and instructions for running multiple pipelines in a single Logstash instance.

/etc/logstash/jvm.options         Contains JVM configuration flags. Use this file to set initial and maximum values for total heap space.

/etc/logstash/startup.options     Contains options used by the system-install script to build the startup script/service for Logstash.
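
For reference, a minimal sketch of the two settings files above; the pipeline id, config glob, and heap sizes are illustrative values, not taken from this installation.

pipelines.yml (a single pipeline that loads every config file under conf.d):

- pipeline.id: main
  path.config: "/etc/logstash/conf.d/*.conf"
  pipeline.workers: 2

jvm.options (initial and maximum heap set to the same value):

-Xms1g
-Xmx1g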

Testing the installation

First pipeline

cd /usr/share/logstash/

bin/logstash -e 'input {stdin{}} output{stdout{}}'          This opens an interactive shell; after a short wait you will see: The stdin plugin is now waiting for input:

Type hello world and press Enter; the following is printed:

{
    "@version" => "1",
    "message" => "hello world",
    "@timestamp" => 2022-04-26T09:18:26.485741Z,
    "event" => {
        "original" => "hello world"
    },
    "host" => {
        "hostname" => "10-52-6-111"
    }
}

The installation works.
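
A slightly larger one-liner that exercises all three pipeline stages makes a useful smoke test; the added field name env is only an illustration:

bin/logstash -e 'input { stdin {} } filter { mutate { add_field => { "env" => "test" } } } output { stdout { codec => rubydebug } }'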

 ----------------------------------------------------------------------------------------------------

Pipeline configuration

pwd
/etc/logstash/conf.d

Final working configuration:

Versions involved (Logstash, Kafka, ES):

logstash 8.1.3 (using the bundled JDK: /usr/share/logstash/jdk)

es 7.16.3

kafka 2.7.1

cat /etc/logstash/conf.d/kafka_to_es.conf

input {
    kafka {
        bootstrap_servers => "106:9093,10.1:9093,10:9093"
        topics => "vslogs"
        group_id => "vslogs_group_id_1"
        client_id => "vslogs_client_id_1"
        auto_offset_reset => "latest"
        consumer_threads => 3
        decorate_events => true
        type => "vslogs"
        codec => "json"
        sasl_mechanism => "SCRAM-SHA-256"
        security_protocol => "SASL_PLAINTEXT"
        sasl_jaas_config => "org.apache.kafka.common.security.scram.ScramLoginModule required username='' password='';"
    }

    kafka {
        bootstrap_servers => "6:9093,2.31:9093,1.4.0.112:9093"
        topics => "vsulblog"
        group_id => "vsulblog_group_id_1"
        client_id => "vsulblog_client_id_1"
        auto_offset_reset => "latest"
        consumer_threads => 3
        decorate_events => true
        type => "vsulblog"
        codec => "json"
        sasl_mechanism => "SCRAM-SHA-256"
        security_protocol => "SASL_PLAINTEXT"
        sasl_jaas_config => "org.apache.kafka.common.security.scram.ScramLoginModule required username='' password='';"
    }
}

filter {}

output {
    if [type] == "vslogs" {
        elasticsearch {
            hosts => [ ":9200", ":9200", ":9200" ]
            index => "vs-vslogs"
            user => ""
            password => ""
        }
    }
    if [type] == "vsulblog" {
        elasticsearch {
            hosts => [ ":9200", ":9200", "1.4.1.9:9200" ]
            index => "vs-vsulblog"
            user => ""
            password => ""
        }
    }
}
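
Because decorate_events => true is set, every event also carries Kafka metadata under [@metadata][kafka]. As an alternative to tagging events with type, the output could route on the source topic; a sketch of the idea (not the configuration actually deployed above):

output {
    if [@metadata][kafka][topic] == "vslogs" {
        elasticsearch {
            hosts => [ ":9200" ]
            index => "vs-vslogs"
        }
    }
}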

Notes on this configuration:

1. When the input block contains multiple Kafka sources, a client_id must be set for each of them (e.g. client_id => "client1", client_id => "client2") and the values must differ.

In cases when multiple inputs are being used in a single pipeline, reading from different topics,
it’s essential to set a different group_id => ... for each input. Setting a unique client_id => ... is also recommended.

2. Parameter names differ across Logstash versions (see the sketch below):
   topics => "accesslogs"                                            -- older Logstash versions use topic_id instead
   bootstrap_servers => "JANSON01:9092,JANSON02:9092,JANSON03:9092"  -- older versions connect through ZooKeeper with zk_connect => "JANSON01:2181,xx" instead
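
A minimal side-by-side sketch of the two parameter styles; the group_id and client_id values are placeholders:

# Current kafka input plugin: brokers and topics
kafka {
    bootstrap_servers => "JANSON01:9092,JANSON02:9092,JANSON03:9092"
    topics => ["accesslogs"]
    group_id => "accesslogs_group"
    client_id => "accesslogs_client_1"
}

# Older versions consumed through ZooKeeper instead:
# kafka {
#     zk_connect => "JANSON01:2181,xx"
#     topic_id => "accesslogs"
# }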



Configuration for reading from a file and writing to Kafka:

input {
    file {
        codec => plain {
            charset => "UTF-8"
        }
        path => "/root/logserver/gamelog.txt"    # a glob such as /tmp/log/* would match every file under that path
        discover_interval => 5
        start_position => "beginning"
    }
}

output {
    kafka {
        topic_id => "gamelogs"
        codec => plain {
            format => "%{message}"
            charset => "UTF-8"
        }
        bootstrap_servers => "node01:9092,node02:9092,node03:9092"
    }
}
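
To verify that events actually reach Kafka, one option is the console consumer shipped with Kafka (the install path is an assumption; the broker and topic come from the example above):

/path/to/kafka/bin/kafka-console-consumer.sh --bootstrap-server node01:9092 --topic gamelogs --from-beginning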

Using the tools in bin:

List the installed plugins:

/usr/share/logstash/bin/logstash-plugin list

Check that a configuration file is written correctly:

/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/kafka_to_es.conf --config.test_and_exit
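
Two other commonly used invocations (the plugin name below is only an example):

# Install an additional plugin
/usr/share/logstash/bin/logstash-plugin install logstash-filter-translate

# Run a pipeline and reload it automatically when the config file changes
/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/kafka_to_es.conf --config.reload.automatic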

Running Logstash as a service

systemctl cat logstash
# /etc/systemd/system/logstash.service
[Unit]
Description=logstash

[Service]
Type=simple
User=logstash
Group=logstash
# Load env vars from /etc/default/ and /etc/sysconfig/ if they exist.
# Prefixing the path with '-' makes it try to load, but if the file doesn't
# exist, it continues onward.
EnvironmentFile=-/etc/default/logstash
EnvironmentFile=-/etc/sysconfig/logstash
ExecStart=/usr/share/logstash/bin/logstash "--path.settings" "/etc/logstash"
Restart=always
WorkingDirectory=/
Nice=19
LimitNOFILE=16384

# When stopping, how long to wait before giving up and sending SIGKILL?
# Keep in mind that SIGKILL on a process can cause data loss.
TimeoutStopSec=infinity

[Install]
WantedBy=multi-user.target
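
With the unit file in place, the usual systemd workflow applies:

sudo systemctl daemon-reload
sudo systemctl enable --now logstash
sudo systemctl status logstash
sudo journalctl -u logstash -f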

References:

https://www.elastic.co/guide/en/logstash/current/index.html

https://cloud.tencent.com/developer/article/1353068?from=10680

https://www.cnblogs.com/lshan/p/14121342.html

Logstash usage guide and common problems

Logstash is a core component of the Elastic Stack, used for data collection, transformation, and shipping. It is highly extensible and flexible, and suits scenarios such as real-time log analysis, security monitoring, and performance monitoring.

1. Basic architecture

A Logstash pipeline has three stages: input, filter, and output. Each stage is configured through plugins.

- The input stage receives data from sources such as files, syslog, or TCP/UDP streams.
- The filter stage parses, modifies, or enriches events.
- The output stage sends the processed data to a destination such as Elasticsearch, Kafka, or another store.

2. Configuration file structure

A Logstash pipeline configuration usually has three parts: input, filter, and output. A simple example:

input {
    file {
        path => "/var/log/*.log"
        start_position => "beginning"
    }
}
filter {
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
}
output {
    elasticsearch {
        hosts => ["http://localhost:9200"]
        index => "logs-%{+YYYY.MM.dd}"
    }
}

This configuration reads log files under /var/log/, parses them with the grok filter, and indexes the result into Elasticsearch.

3. Common problems and fixes

(1) Logstash fails to start

Startup failures are usually caused by insufficient memory or syntax errors in the configuration. Increase the JVM heap by editing the -Xms and -Xmx values in jvm.options, and validate the configuration from the command line:

bin/logstash -f /path/to/config.conf --config.test_and_exit

(2) Logs ingested more than once

If the same log lines are ingested repeatedly, a corrupted sincedb file is a common cause. The sincedb file records the last read position so that subsequent reads are incremental; deleting it and re-syncing usually resolves the problem:

rm .sincedb_*

(3) Grok parse failures

When a grok expression does not match the expected pattern, field extraction fails. Validate the pattern with a debugging tool such as an online Grok Debugger.

4. Tuning tips

To improve Logstash throughput:

- Remove unnecessary filter steps to lower CPU usage.
- Batch writes to downstream services to reduce network overhead.
- Clean up old log files regularly to avoid running out of disk space.

Summary

With a solid understanding of Logstash and a sensible configuration, it can handle data-flow management across a wide range of scenarios, from basic setups to advanced tuning.