Flink on k8s部署日志详解及与Yarn部署时的日志生成模式对比
最近需要将flink由原先部署到Yarn集群切换到kubernetes集群,在切换之后需要熟悉flink on k8s的运行模式。在使用过程中针对日志模块发现,在k8s的容器中,flink的系统日志只有jobmanager.log/taskmanager.log 两个,而当时在使用Yarn集群部署时,flink的日志会有多个,比如:jobmanager.log、jobmanager.err和jobmanager.out,TaskManager同理。
因此,有同事就提出为什么在k8s中部署时,只有.log一个文件,能不能类似Yarn部署时那样对日志文件进行区分。只是从容器日志来看的话,在一开始不够了解k8s的情况下,会觉得日志收集的不够准确。
因此针对上面的这个问题,就归我进行研究和解决了。网上的相关资料也比较少,因此,在本次对上面这个问题整体了解分析之后,进行一次学习记录。有遇到相关类似问题的,也可以参考这个思路。
一、认为需要修改log4j配置即可
拿到这个问题的第一步首先想到的是,既然要对日志的类别进行区分,则可以修改log4j的配置,将INFO类别和ERROR类别分别写入不同的日志文件即可。于是,先对flink路径下的conf/log4j-console.properties进行修改(flink on k8s部署时,使用的log4j配置文件是flink-console.properties文件,而不是log4j.properties)。
这里我们留下一个小疑问:为什么部署到k8s中时,使用的是log4j-console.properties,而不是部署到Yarn时的log4j.properties?有什么区别?
修改后的log4j-console.properties示例如下所示:
################################################################################
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
################################################################################
# Allows this configuration to be modified at runtime. The file will be checked every 30 seconds.
monitorInterval=30
# This affects logging for both user code and Flink
rootLogger.level = INFO
rootLogger.appenderRef.console.ref = ConsoleAppender
rootLogger.appenderRef.rolling.ref = RollingFileAppender
rootLogger.appenderRef.errorLogFile.ref = errorLogFile
# Uncomment this if you want to _only_ change Flink's logging
#logger.flink.name = org.apache.flink
#logger.flink.level = INFO
# The following lines keep the log level of common libraries/connectors on
# log level INFO. The root logger does not override this. You have to manually
# change the log levels here.
logger.akka.name = akka
logger.akka.level = INFO
logger.kafka.name= org.apache.kafka
logger.kafka.level = INFO
logger.hadoop.name = org.apache.hadoop
logger.hadoop.level = INFO
logger.zookeeper.name = org.apache.zookeeper
logger.zookeeper.level = INFO
# Log all infos to the console
appender.console.name = ConsoleAppender
appender.console.type = CONSOLE
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
# Log all infos in the given rolling file
appender.rolling.name = RollingFileAppender
appender.rolling.type = RollingFile
appender.rolling.append = true
appender.rolling.fileName = ${sys:log.file}
appender.rolling.filePattern = ${sys:log.file}.%i
appender.rolling.layout.type = PatternLayout
appender.rolling.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
appender.rolling.policies.type = Policies
appender.rolling.policies.size.type = SizeBasedTriggeringPolicy
appender.rolling.poli

本文详细对比了Flink在Kubernetes和Yarn集群部署时的日志生成模式。起初尝试修改log4j配置区分日志未成功,后通过源码分析找到Yarn生成.err和.out日志的原因,发现K8s缺少重定向语句。设计解决方案后虽生成对应文件,但出现容器日志丢失问题,最终采用双写解决。
最低0.47元/天 解锁文章
836

被折叠的 条评论
为什么被折叠?



