How Flink on K8s Generates Container Logs, Compared with the Log Generation Mode of a Yarn Deployment

This article compares in detail how Flink generates logs when deployed on a Kubernetes cluster versus on a Yarn cluster. An initial attempt to separate the logs by modifying the log4j configuration did not work. Source-code analysis then revealed why Yarn produces .err and .out log files, and showed that the K8s deployment lacks the corresponding redirection statement. A solution was designed that did produce the expected files, but it caused the container log (what `kubectl logs` shows) to go missing; the final fix was to double-write the output to both destinations.


Recently we needed to switch our Flink deployment from a Yarn cluster to a Kubernetes cluster, and after the switch we had to get familiar with how Flink on K8s runs. While doing so, we noticed a difference in the logging module: inside a K8s container, Flink's system logs consist of only two files, jobmanager.log / taskmanager.log, whereas a Yarn deployment produces several log files per process, e.g. jobmanager.log, jobmanager.err, and jobmanager.out (and likewise for the TaskManager).
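The root cause (worked out in detail in the source-code analysis later in this article) is where each platform sends the JVM's stdout and stderr. Below is a simplified, illustrative sketch of the two launch commands; the real launch_container.sh generated by the Yarn NodeManager is much longer, and the paths and class names here are examples, not verbatim output:

```sh
# Yarn: the NodeManager's generated launch_container.sh explicitly redirects
# the JVM's stdout/stderr into files in the container log directory; this is
# where jobmanager.out and jobmanager.err come from (sketch, not verbatim):
exec $JAVA_HOME/bin/java ... org.apache.flink.yarn.entrypoint.YarnJobClusterEntrypoint \
  1> <LOG_DIR>/jobmanager.out \
  2> <LOG_DIR>/jobmanager.err

# K8s: the image entrypoint starts the JVM without any such redirection, so
# stdout/stderr go straight to the container runtime and are only visible
# through `kubectl logs` (sketch, not verbatim):
exec $JAVA_HOME/bin/java ... org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint
```

On both platforms, jobmanager.log itself is written by the log4j file appender, which is why that file exists in either deployment mode.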

A colleague therefore asked why a K8s deployment produces only the single .log file, and whether the log files could be split up the way they are under Yarn. Judging from the container logs alone, and without a good understanding of K8s at first, it can look as if log collection is incomplete.

The question was handed to me to investigate and solve. There is little material about it online, so after analyzing and understanding the whole problem, I am writing this up as a study record. If you run into a similar issue, this line of reasoning may serve as a reference.

1. First assumption: modifying the log4j configuration should be enough

My first thought on receiving this problem was: since we want to separate log categories, we can simply modify the log4j configuration so that INFO-level and ERROR-level records are written to different log files. So I started by editing conf/log4j-console.properties under the Flink directory (when deploying Flink on K8s, the log4j configuration file that takes effect is log4j-console.properties, not log4j.properties).

Let us note a small question here for later: why does a K8s deployment use log4j-console.properties, while a Yarn deployment uses log4j.properties? What is the difference?

The modified log4j-console.properties looks like this:

```properties

################################################################################
#  Licensed to the Apache Software Foundation (ASF) under one
#  or more contributor license agreements.  See the NOTICE file
#  distributed with this work for additional information
#  regarding copyright ownership.  The ASF licenses this file
#  to you under the Apache License, Version 2.0 (the
#  "License"); you may not use this file except in compliance
#  with the License.  You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
#  Unless required by applicable law or agreed to in writing, software
#  distributed under the License is distributed on an "AS IS" BASIS,
#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#  See the License for the specific language governing permissions and
#  limitations under the License.
################################################################################

# Allows this configuration to be modified at runtime. The file will be checked every 30 seconds.
monitorInterval=30

# This affects logging for both user code and Flink
rootLogger.level = INFO
rootLogger.appenderRef.console.ref = ConsoleAppender
rootLogger.appenderRef.rolling.ref = RollingFileAppender
rootLogger.appenderRef.errorLogFile.ref = errorLogFile

# Uncomment this if you want to _only_ change Flink's logging
#logger.flink.name = org.apache.flink
#logger.flink.level = INFO

# The following lines keep the log level of common libraries/connectors on
# log level INFO. The root logger does not override this. You have to manually
# change the log levels here.
logger.akka.name = akka
logger.akka.level = INFO
logger.kafka.name= org.apache.kafka
logger.kafka.level = INFO
logger.hadoop.name = org.apache.hadoop
logger.hadoop.level = INFO
logger.zookeeper.name = org.apache.zookeeper
logger.zookeeper.level = INFO

# Log all infos to the console
appender.console.name = ConsoleAppender
appender.console.type = CONSOLE
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n

# Log all infos in the given rolling file
appender.rolling.name = RollingFileAppender
appender.rolling.type = RollingFile
appender.rolling.append = true
appender.rolling.fileName = ${sys:log.file}
appender.rolling.filePattern = ${sys:log.file}.%i
appender.rolling.layout.type = PatternLayout
appender.rolling.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
appender.rolling.policies.type = Policies
appender.rolling.policies.size.type = SizeBasedTriggeringPolicy
appender.rolling.policies.size.size = 100MB
appender.rolling.policies.startup.type = OnStartupTriggeringPolicy
appender.rolling.strategy.type = DefaultRolloverStrategy
appender.rolling.strategy.max = ${env:MAX_LOG_FILE_NUMBER:-10}

# Custom appender added for this experiment, referenced as errorLogFile by the
# rootLogger above: route ERROR-level (and above) records to a separate file.
# The target file name below is an assumed example.
appender.errorLogFile.name = errorLogFile
appender.errorLogFile.type = File
appender.errorLogFile.append = true
appender.errorLogFile.fileName = ${sys:log.file}.err
appender.errorLogFile.layout.type = PatternLayout
appender.errorLogFile.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
appender.errorLogFile.filter.threshold.type = ThresholdFilter
appender.errorLogFile.filter.threshold.level = ERROR

# Suppress the irrelevant (wrong) warnings from the Netty channel handler
logger.netty.name = org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline
logger.netty.level = OFF
```
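If this configuration worked as intended, each pod would produce three outputs: the ConsoleAppender feeding the container's stdout (what `kubectl logs` shows), the RollingFileAppender writing ${sys:log.file} (e.g. jobmanager.log), and the errorLogFile appender writing ERROR records to a separate file. A quick way to check inside a running pod is sketched below; the pod name is illustrative, and /opt/flink/log is the default log directory in the official Flink image:

```sh
# List the log files the appenders actually created inside the JobManager pod
# (pod name is an example; adjust to your deployment):
kubectl exec -it flink-jobmanager-0 -- ls -l /opt/flink/log

# Check the container's stdout/stderr stream, which the ConsoleAppender feeds:
kubectl logs flink-jobmanager-0 --tail=20
```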