125. Spark Large E-Commerce Project - Debugging Spark Streaming + Kafka: Kafka errors NotLeaderForPartitionException, TimeoutException

Problem addressed: Kafka errors NotLeaderForPartitionException and TimeoutException: Batch containing * record(s) expired

Reposted from: https://blog.youkuaiyun.com/jiecxy/article/details/53369173

1. Error Description

While running, the Kafka producer reports large numbers of the following errors:

# Error type 1
org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
# Error type 2
org.apache.kafka.common.errors.TimeoutException: Batch containing * record(s) expired due to timeout while requesting metadata from brokers for test-*

Note: since Kafka 0.10.0.1, the error is reported as:
Expiring * record(s) for test-* due to *
org.apache.kafka.common.errors.TimeoutException: Expiring * record(s) for test-0: 30941 ms has passed since last append

The system returns to normal after a while, however.
There are many possible causes of these errors; this post only records the problems the author ran into and how they were resolved.

2. Test Script and System State That Produced the Errors

Three broker nodes, each with a 24-core CPU, 32 GB RAM, and a 1 TB disk
Kafka version: 0.10.0.1
Zookeeper version: 3.4.9 / 3.4.6
Disks mounted per node: 1
Key broker settings (all at their defaults):
    num.network.threads=3
    num.io.threads=8

Topic:
bin/kafka-topics.sh --create --zookeeper hw168:2181 --replication-factor 3 --partitions 6 --topic test

The test script below was run on two nodes at the same time (with the parameters shown):
bin/kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic test --num-records 500000000 --record-size 100 --throughput -1 --producer-props bootstrap.servers=yourhostname:9092 buffer.memory=67108864 batch.size=8192 acks=1

3. Error Analysis

3.1 org.apache.kafka.common.errors.NotLeaderForPartitionException

  • Analyzing the server.log: one broker's ZooKeeper session timed out ("session timed out"), which in turn brought down the controller and forced a re-election. The resulting stale metadata meant that partitions held by the failed broker raised NotLeaderForPartitionException. The relevant log entries:

Note: later debugging showed that the network-I/O select between the broker's ZooKeeper client and the ZooKeeper server was severely delayed (by up to twenty seconds), leaving the process effectively frozen. That caused the session timeout and missed heartbeats, which triggered everything that followed (which is why the final fix of raising the session timeout treats the symptom, not the root cause).

[2016-12-01 15:35:47,946] INFO Client session timed out, have not heard from server in 7500ms for sessionid 0x58b937e13d0001, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
[2016-12-01 15:35:48,046] DEBUG Received event: WatchedEvent state:Disconnected type:None path:null (org.I0Itec.zkclient.ZkClient)
[2016-12-01 15:35:48,047] INFO zookeeper state changed (Disconnected) (org.I0Itec.zkclient.ZkClient)
[2016-12-01 15:35:48,047] DEBUG Leaving process event (org.I0Itec.zkclient.ZkClient)
[2016-12-01 15:35:48,391] INFO Opening socket connection to server 172.18.11.169/172.18.11.169:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2016-12-01 15:35:48,391] INFO Socket connection established to 172.18.11.169/172.18.11.169:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2016-11-27 22:20:39,802] INFO Creating /controller (is it secure? false) (kafka.utils.ZKCheckedEphemeral)
[2016-11-27 22:20:39,806] INFO Result of znode creation is: OK (kafka.utils.ZKCheckedEphemeral)
[2016-11-27 22:20:39,806] INFO 0 successfully elected as leader (kafka.server.ZookeeperLeaderElector)
[2016-11-27 22:20:05,918] INFO re-registering broker info in ZK for broker 2 (kafka.server.KafkaHealthcheck$SessionExpireListener)
[2016-11-27 22:20:05,919] INFO Creating /brokers/ids/2 (is it secure? false) (kafka.utils.ZKCheckedEphemeral)
[2016-11-27 22:20:05,972] INFO Result of znode creation is: OK (kafka.utils.ZKCheckedEphemeral)
[2016-11-27 22:20:05,972] INFO Registered broker 2 at path /brokers/ids/2 with addresses: PLAINTEXT -> EndPoint(172.18.11.169,9092,PLAINTEXT) (kafka.utils.ZkUtils)
[2016-11-27 22:20:05,973] INFO done re-registering broker (kafka.server.KafkaHealthcheck$SessionExpireListener)
[2016-11-27 22:20:05,973] INFO Subscribing to /brokers/topics path to watch for new topics (kafka.server.KafkaHealthcheck$SessionExpireListener)
[2016-11-27 22:20:05,981] INFO New leader is 1 (kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
  • Analyzing the ZooKeeper log
    Before the errors appeared, i.e. after the system froze, the ZooKeeper server closed its connection to Kafka because of the timeout. The broker's session therefore expired, but it reconnected quickly, so on the whole the system appeared to recover:
2016-12-01 15:27:06,222 [myid:0] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@868] - Client attempting to establish new session at /172.18.11.169:39879
2016-12-01 15:27:06,239 [myid:0] - INFO  [CommitProcessor:0:ZooKeeperServer@617] - Established session 0x58b937e13d0001 with negotiated timeout 6000 for client /172.18.11.169:39879
2016-12-01 15:35:48,586 [myid:0] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x58b937e13d0001, likely client has closed socket
    at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
    at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
    at java.lang.Thread.run(Thread.java:745)
2016-12-01 15:35:48,591 [myid:0] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /172.18.11.169:39879 which had sessionid 0x58b937e13d0001
  • Analyzing the kafkaServer-gc.log

     No Full GC occurred and all GC activity looked normal, so GC was ruled out as a cause.

  • Analyzing system resource usage

Each broker's NIC carried roughly 80 MB/s in and out, and memory was nearly exhausted, with only about 200 MB free. On the disk side, iowait sat around 35% and peaked near 40%, and await was far larger than svctm, indicating the disks were under heavy I/O pressure.
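The observed figures line up with a simple traffic model. As a rough sanity check (my own sketch, not from the original post; it assumes leaders are spread evenly across the brokers, followers fetch their copies from the leader, and consumer traffic is ignored), with 3 brokers and replication factor 3 every produced byte ends up written to every broker's single disk:

```python
# Hypothetical per-broker traffic model: 3 brokers, replication factor 3,
# leaders spread evenly, followers fetch from leaders, no consumer traffic.

def per_broker_load(producer_mb_s: float, brokers: int = 3, rf: int = 3):
    leader_share = producer_mb_s / brokers            # producer bytes landing on this broker's leaders
    follower_in = producer_mb_s * (rf - 1) / brokers  # replica fetches for partitions it follows
    net_in = leader_share + follower_in               # total NIC ingress
    net_out = leader_share * (rf - 1)                 # this broker's leaders replicated to two followers
    disk_write = leader_share + follower_in           # every replica is written to the local log
    return net_in, net_out, disk_write

net_in, net_out, disk_write = per_broker_load(80.0)
print(net_in, net_out, disk_write)  # roughly 80 MB/s in, 53 MB/s out, 80 MB/s disk writes
```

Under these assumptions, an aggregate produce rate of ~80 MB/s already means ~80 MB/s of sustained sequential writes per broker, which a single spindle struggles to absorb alongside the random I/O of 6 partitions x 3 replicas.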

3.2 org.apache.kafka.common.errors.TimeoutException


Analyzing the producer source code
The run method of Sender shows that on each pass the thread checks the RecordAccumulator for expired batches; the error itself is raised in RecordBatch's maybeExpire method. The root cause is that the Sender thread cannot flush the accumulated messages within the allowed time, so they expire.
In this experiment buffer.memory is 64 MB, far larger than the 8 KB batch.size. Given the topic layout, one ProduceRequest carries 6 batches, i.e. 48 KB (ignoring whether each batch is full), so under a sustained stream of data the buffer can accumulate 64*1024/48 = 1365 requests' worth of backlog. When the broker cannot keep up (because of network or disk bottlenecks, for example), I/O blocks, the buffered data cannot be sent in time, and the timeout is reported.
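The backlog arithmetic above can be checked directly, using the experiment's own settings (64 MB buffer.memory, 8 KB batch.size, 6 partitions):

```python
# Back-of-the-envelope check of the request backlog described above.
buffer_memory = 64 * 1024 * 1024   # producer buffer.memory: 64 MB
batch_size = 8 * 1024              # producer batch.size: 8 KB
partitions = 6                     # topic "test" was created with 6 partitions

request_size = partitions * batch_size            # one ProduceRequest: one batch per partition
backlog_requests = buffer_memory // request_size  # how many requests the buffer can hold

print(request_size // 1024, backlog_requests)  # 48 (KB per request), 1365 buffered requests
```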

4. Solutions

4.1 org.apache.kafka.common.errors.NotLeaderForPartitionException

  • Increase the ZooKeeper session timeout
    This mitigates the problem effectively (though the system still stalls, hurting performance). Change the initial value to 30 s in server.properties:
zookeeper.session.timeout.ms=30000
  • Increase the number of disk I/O threads (tune the exact value to your environment)
    In server.properties:
num.io.threads=12
  • Add more log data disks (tune the exact number to your environment)
    Mount several more disks on each machine and update the Kafka server.properties:
log.dirs=/mnt/datadir1/kafka-logs-0,/mnt/datadir2/kafka-logs-0,/mnt/datadir3/kafka-logs-0
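One caveat on the session-timeout fix above: the value the client requests is only a proposal. The ZooKeeper server clamps it into its [minSessionTimeout, maxSessionTimeout] window, which defaults to 2x and 20x tickTime, so with the default tickTime of 2000 ms anything above 40 s is silently reduced (the 6000 ms negotiated timeout visible in the ZooKeeper log earlier fits this window). A small sketch of that clamping, assuming default server settings:

```python
def negotiated_session_timeout(requested_ms: int, tick_time_ms: int = 2000) -> int:
    # ZooKeeper clamps the client's requested session timeout to
    # [2 * tickTime, 20 * tickTime] (the server-side defaults for
    # minSessionTimeout / maxSessionTimeout).
    return max(2 * tick_time_ms, min(requested_ms, 20 * tick_time_ms))

print(negotiated_session_timeout(30000))  # 30000 -> within bounds, takes effect
print(negotiated_session_timeout(60000))  # 40000 -> clamped by maxSessionTimeout
```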

4.2 org.apache.kafka.common.errors.TimeoutException

  • Increase batch.size or decrease buffer.memory
    Change the producer parameters (tune the exact values to your environment):
batch.size=65536

or

buffer.memory=524288

Note: in a quick experiment on my machines, performance was best with buffer.memory at roughly 64 times batch.size. At a fixed ratio, latency grows as batch.size grows, so test first and pick a latency you can accept. Notably, a buffer.memory that is too large relative to batch.size leads to very high latency, and batch.size must be larger than the record size or no batching happens at all. Also remember to set linger.ms: with the default linger.ms=0, batches are sent before they fill up, so throughput cannot improve.

  • Increase the number of broker threads that handle network requests (tune the exact value to your environment)
    Change the broker's server.properties:
# The number of threads handling network requests, default = 3
num.network.threads=8
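Pulling the producer-side tuning note together, here is a hedged sketch of a configuration that follows the rule of thumb above. The 64x buffer-to-batch ratio is the author's empirical finding for their hardware, and the linger.ms value is my own illustrative choice, not a recommended default:

```python
# Producer tuning sketch following the note above (values are illustrative).
record_size = 100                    # bytes per record in the benchmark

producer_props = {
    "batch.size": 65536,             # must exceed record_size or batching never happens
    "buffer.memory": 65536 * 64,     # ~64x batch.size, per the author's experiment
    "linger.ms": 5,                  # nonzero, so batches are given time to fill
    "acks": "1",                     # as in the original benchmark
}

assert producer_props["batch.size"] > record_size  # sanity check from the note
print(producer_props["buffer.memory"])  # 4194304 (4 MB)
```

Before adopting numbers like these, benchmark on your own hardware and pick the largest batch.size whose added latency you can tolerate.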

 
