ZooKeeper fails with an error on restart
I. Symptom: ZooKeeper reports an error on restart
Check the log:
II. Resolution
1. Restarting several times had no effect
[root@sparkworker1 version-2]# ll
total 462196
-rw-r--r-- 1 cloudera-scm cloudera-scm 3 May 27 15:48 acceptedEpoch
-rw-r--r-- 1 cloudera-scm cloudera-scm 3 May 27 15:48 currentEpoch
-rw-r--r-- 1 cloudera-scm cloudera-scm 67108880 May 27 15:43 log.6100000001
-rw-r--r-- 1 cloudera-scm cloudera-scm 67108880 May 27 15:43 log.6200000001
-rw-r--r-- 1 cloudera-scm cloudera-scm 67108880 May 27 15:44 log.6400000001
-rw-r--r-- 1 cloudera-scm cloudera-scm 67108880 May 27 15:51 log.6900000001
-rw-r--r-- 1 cloudera-scm cloudera-scm 236571334 May 27 15:43 snapshot.6100000001
-rw-r--r-- 1 cloudera-scm cloudera-scm 236601430 May 27 15:48 snapshot.6600000001
Connection broken for id 1, my id = 2, error =
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:979)
Interrupted while waiting for message on queue
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1063)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$700(QuorumCnxManager.java:73)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:898)
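III. Log analysis
On restart, the exceptions come from QuorumCnxManager, which handles the connections between quorum members: the receive worker hits an EOFException because the connection to server 1 was broken, and the send worker is interrupted while waiting for a message on its queue.
This points to a bad entry in the most recent transaction log, which makes data recovery fail at startup.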
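The file names in the directory listing are useful for this analysis. ZooKeeper names each transaction log and snapshot with a hexadecimal zxid, whose high 32 bits are the epoch and whose low 32 bits are a counter. The sketch below is only an illustration and not part of ZooKeeper: it decodes those names and lists the logs that start after the newest snapshot. The function names `parse_zxid` and `logs_after_latest_snapshot` are made up for this example. It looks at file names only, so a log that starts before the snapshot but continues past it is not reported.

```python
import re

# File names copied from the directory listing in this post.
FILES = [
    "acceptedEpoch",
    "currentEpoch",
    "log.6100000001",
    "log.6200000001",
    "log.6400000001",
    "log.6900000001",
    "snapshot.6100000001",
    "snapshot.6600000001",
]

# log.<hex zxid> or snapshot.<hex zxid>
NAME_RE = re.compile(r"^(log|snapshot)\.([0-9a-fA-F]+)$")


def parse_zxid(name):
    """Return (kind, epoch, counter) for a data file name, or None if it is not one."""
    match = NAME_RE.match(name)
    if match is None:
        return None
    zxid = int(match.group(2), 16)
    # High 32 bits: epoch. Low 32 bits: counter within that epoch.
    return match.group(1), zxid >> 32, zxid & 0xFFFFFFFF


def logs_after_latest_snapshot(names):
    """Return log file names whose starting zxid is above the newest snapshot's zxid."""
    snapshots = []
    logs = []
    for name in names:
        match = NAME_RE.match(name)
        if match is None:
            continue  # skips acceptedEpoch, currentEpoch and anything else
        zxid = int(match.group(2), 16)
        if match.group(1) == "snapshot":
            snapshots.append(zxid)
        else:
            logs.append((zxid, name))
    if not snapshots:
        return [name for _, name in sorted(logs)]
    newest = max(snapshots)
    return [name for zxid, name in sorted(logs) if zxid > newest]


if __name__ == "__main__":
    for log_name in logs_after_latest_snapshot(FILES):
        print(log_name, parse_zxid(log_name))
```

For the listing in this post, the only log that starts after snapshot.6600000001 is log.6900000001, which is also the file with the latest modification time (15:51).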
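One thing to try is removing the log.* transaction log files while keeping the snapshot.* files. (In this setup ZooKeeper is only used for HDFS and ResourceManager HA, so losing this data has no impact.)
After deleting the log files and restarting, ZooKeeper returned to normal.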
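The cleanup step can be sketched in Python as follows. This is a minimal illustration under several assumptions, not the exact commands used here: it moves the log.* files into a backup directory instead of deleting them, so they can be restored if needed. It assumes ZooKeeper is stopped on this node before it runs. The function name `move_txn_logs` and both paths in the usage comment are hypothetical.

```python
import shutil
from pathlib import Path


def move_txn_logs(version_dir, backup_dir):
    """Move log.* files from version_dir to backup_dir and return their names.

    Snapshot files and the acceptedEpoch/currentEpoch files are left in place.
    Run this only while ZooKeeper is stopped on this node.
    """
    src = Path(version_dir)
    dst = Path(backup_dir)
    dst.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(src.glob("log.*")):
        if path.is_file():
            shutil.move(str(path), str(dst / path.name))
            moved.append(path.name)
    return moved


# Example call (both paths are placeholders, adjust them to your dataDir):
# move_txn_logs("/path/to/zookeeper/version-2", "/path/to/backup/zk-logs")
```

Moving rather than deleting is a cautious variation on the step described above. As noted, it is only acceptable when the state kept in ZooKeeper can be lost or rebuilt.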