Background
A colleague's Spark program failed while writing data to a Kafka topic through Spark. The error log was as follows:
2021-01-25 19:58:11,794 INFO utils.AppInfoParser: Kafka version : 2.0.0-cdh6.1.1
2021-01-25 19:58:11,794 INFO utils.AppInfoParser: Kafka commitId : null
2021-01-25 19:58:11,984 INFO codegen.CodeGenerator: Code generated in 80.049669 ms
2021-01-25 19:58:12,020 WARN clients.NetworkClient: [Producer clientId=producer-1] Connection to node -3 could not be established. Broker may not be available.
2021-01-25 19:58:12,034 WARN clients.NetworkClient: [Producer clientId=producer-1] Error while fetching metadata with correlation id 4 : {tagging-api-1611575841019=INVALID_REPLICATION_FACTOR}
2021-01-25 19:58:12,034 INFO clients.Metadata: Cluster ID: aKKlHlDqQtalfjbLYRW1GQ
2021-01-25 19:58:12,136 WARN clients.NetworkClient: [Producer clientId=producer-1] Error while fetching metadata with correlation id 9 : {tagging-api-1611575841019=INVALID_REPLICATION_FACTOR}
2021-01-25 19:58:12,241 WARN clients.NetworkClient: [Producer clientId=producer-1] Error while fetching metadata with correlation id 10 : {tagging-api-1611575841019=INVALID_REPLICATION_FACTOR}
2021-01-25 19:58:12,346 WARN clients.NetworkClient: [Producer clientId=producer-1] Error while fetching metadata with correlation id 11 : {tagging-api-1611575841019=INVALID_REPLICATION_FACTOR}
2021-01-25 19:58:12,451 WARN clients.NetworkClient: [Producer clientId=producer-1] Error while fetching metadata with correlation id 12 : {tagging-api-1611575841019=INVALID_REPLICATION_FACTOR}
2021-01-25 19:58:12,554 WARN clients.NetworkClient: [Producer clientId=producer-1] Error while fetching metadata with correlation id 13 : {tagging-api-1611575841019=INVALID_REPLICATION_FACTOR}
2021-01-25 19:58:12,658 WARN clients.NetworkClient: [Producer clientId=producer-1] Error while fetching metadata with correlation id 14 : {tagging-api-1611575841019=INVALID_REPLICATION_FACTOR}
2021-01-25 19:58:12,763 WARN clients.NetworkClient: [Producer clientId=producer-1] Error while fetching metadata with correlation id 15 : {tagging-api-1611575841019=INVALID_REPLICATION_FACTOR}
2021-01-25 19:58:12,867 WARN clients.NetworkClient: [Producer clientId=producer-1] Error while fetching metadata with correlation id 16 : {tagging-api-1611575841019=INVALID_REPLICATION_FACTOR}
2021-01-25 19:58:12,971 WARN clients.NetworkClient: [Producer clientId=producer-1] Error while fetching metadata with correlation id 17 : {tagging-api-1611575841019=INVALID_REPLICATION_FACTOR}
2021-01-25 19:58:13,076 WARN clients.NetworkClient: [Producer clientId=producer-1] Error while fetching metadata with correlation id 18 : {tagging-api-1611575841019=INVALID_REPLICATION_FACTOR}
2021-01-25 19:58:13,181 WARN clients.NetworkClient: [Producer clientId=producer-1] Error while fetching metadata with correlation id 19 : {tagging-api-1611575841019=INVALID_REPLICATION_FACTOR}
2021-01-25 19:58:13,285 WARN clients.NetworkClient: [Producer clientId=producer-1] Error while fetching metadata with correlation id 20 : {tagging-api-1611575841019=INVALID_REPLICATION_FACTOR}
2021-01-25 19:58:13,390 WARN clients.NetworkClient: [Producer clientId=producer-1] Error while fetching metadata with correlation id 21 : {tagging-api-1611575841019=INVALID_REPLICATION_FACTOR}
2021-01-25 19:58:13,495 WARN clients.NetworkClient: [Producer clientId=producer-1] Error while fetching metadata with correlation id 22 : {tagging-api-1611575841019=INVALID_REPLICATION_FACTOR}
2021-01-25 19:58:13,600 WARN clients.NetworkClient: [Producer clientId=producer-1] Error while fetching metadata with correlation id 23 : {tagging-api-1611575841019=INVALID_REPLICATION_FACTOR}
2021-01-25 19:58:13,704 WARN clients.NetworkClient: [Producer clientId=producer-1] Error while fetching metadata with correlation id 24 : {tagging-api-1611575841019=INVALID_REPLICATION_FACTOR}
2021-01-25 19:58:13,808 WARN clients.NetworkClient: [Producer clientId=producer-1] Error while fetching metadata with correlation id 25 : {tagging-api-1611575841019=INVALID_REPLICATION_FACTOR}
2021-01-25 19:58:13,913 WARN clients.NetworkClient: [Producer clientId=producer-1] Error while fetching metadata with correlation id 26 : {tagging-api-1611575841019=INVALID_REPLICATION_FACTOR}
2021-01-25 19:58:14,044 WARN clients.NetworkClient: [Producer clientId=producer-1] Error while fetching metadata with correlation id 27 : {tagging-api-1611575841019=INVALID_REPLICATION_FACTOR}
2021-01-25 19:58:14,149 WARN clients.NetworkClient: [Producer clientId=producer-1] Error while fetching metadata with correlation id 28 : {tagging-api-1611575841019=INVALID_REPLICATION_FACTOR}
2021-01-25 19:58:14,254 WARN clients.NetworkClient: [Producer clientId=producer-1] Error while fetching metadata with correlation id 29 : {tagging-api-1611575841019=INVALID_REPLICATION_FACTOR}
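Investigation steps:
- The INVALID_REPLICATION_FACTOR code in the log suggested at first that the replication factor was invalid.
- Checking the Kafka configuration showed that server.properties defaults to three replicas.
- Checking whether the target topic existed showed that it did not, so the first suspicion was that topic creation had failed.
- The colleague had previously run the same job successfully on this Kafka cluster, so the next step was to check the state of the cluster itself.
- One Kafka broker node turned out to be down, and restarting it failed. The restart failed because the disk was full. After clearing out unused logs to free up space, Kafka started without errors.
- The colleague ran the program again, and this time it ran successfully. Problem solved √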
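The last two checks (is every broker reachable, and is the disk nearly full) can be scripted so they run before digging into topic configuration. The sketch below is only an illustration using the Python standard library, not what was actually run during this incident. The function names are made up for this example, and an open TCP port only shows that something is listening there, not that the broker is healthy.

```python
import shutil
import socket


def parse_bootstrap_servers(bootstrap_servers):
    """Split a 'host:port,host:port' string into (host, port) tuples."""
    pairs = []
    for entry in bootstrap_servers.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, _, port = entry.rpartition(":")
        pairs.append((host, int(port)))
    return pairs


def broker_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


def unreachable_brokers(bootstrap_servers, timeout=2.0):
    """Return the 'host:port' entries that do not accept a TCP connection."""
    return [
        f"{host}:{port}"
        for host, port in parse_bootstrap_servers(bootstrap_servers)
        if not broker_reachable(host, port, timeout)
    ]


def free_ratio(path):
    """Return the fraction of free space on the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total
```

Calling `unreachable_brokers` with the same string the job passes as its bootstrap servers lists the entries that refuse connections. Running `free_ratio` on the broker host against the directory that holds the Kafka data shows whether the disk is close to full. Both the address list and the directory are deployment-specific and are not given in this post.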
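Summary
Before writing this post, I went back over the error log. It contains this sentence: Connection to node -3 could not be established. Broker may not be available. So, emmmm, reading the log a bit more carefully would have pinpointed the problem much faster. I am writing this down as a reminder to myself: read the log carefully. Read the log carefully.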
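One reason the key line was easy to miss is that it appears once, while the INVALID_REPLICATION_FACTOR warning repeats dozens of times. A small script can collapse the repeats so the rare messages stand out. The sketch below is a minimal illustration that assumes the log layout shown above (date, time, level, logger, message). The function name `summarize_warnings` is invented for this example.

```python
import re
from collections import Counter

# date time LEVEL logger: message
LINE_RE = re.compile(r"^(\S+ \S+) (INFO|WARN|ERROR) (\S+): (.*)$")
# The correlation id changes on every retry, so mask it before counting.
CORRELATION_RE = re.compile(r"correlation id \d+")


def summarize_warnings(log_text):
    """Return (count, message) pairs for non-INFO lines, rarest first."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match or match.group(2) == "INFO":
            continue
        message = CORRELATION_RE.sub("correlation id N", match.group(4))
        counts[message] += 1
    return sorted(((n, msg) for msg, n in counts.items()), key=lambda p: p[0])
```

Run over the log in this post, the single "Connection to node -3 could not be established" warning would be listed ahead of the repeated metadata warning. As far as I know, negative node ids in Kafka client logs refer to entries of the bootstrap server list rather than to real broker ids, so node -3 would be the third configured address. Treat that reading as an assumption to verify against your own configuration.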
