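Troubleshooting a ZooKeeper startup error
While practicing with ZooKeeper today, I cleaned out the contents of the data directory, and the server then failed to start.
Items removed (after stopping the service):
version-2
zookeeper_server.pid
The troubleshooting went as follows.
1. Start zk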
[root@hdss-7-21 data]# zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/apache-zookeeper-3.5.8-bin/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@hdss-7-21 data]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/apache-zookeeper-3.5.8-bin/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost.
Error contacting service. It is probably not running.
The start command reports STARTED, but the status check then reports:
Error contacting service. It is probably not running.
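Because the start command reported STARTED even though the server was not actually up, it can help to check whether the process recorded in zookeeper_server.pid is still alive. The following is only a minimal sketch: the function name zk_pid_alive is hypothetical, and it assumes the pid file sits in the data directory, as in the list of removed items above.

```shell
# Hypothetical helper (not part of ZooKeeper): report whether the process
# recorded in a pid file is still alive.
# Usage: zk_pid_alive /path/to/data/zookeeper_server.pid
zk_pid_alive() {
    pid_file=$1
    # No pid file means there is nothing to check.
    [ -f "$pid_file" ] || return 1
    pid=$(cat "$pid_file")
    # An empty pid file is treated as "not running".
    [ -n "$pid" ] || return 1
    # kill -0 sends no signal; it only tests that the process exists
    # and that the current user is allowed to signal it.
    kill -0 "$pid" 2>/dev/null
}
```

If the function returns non-zero a few seconds after zkServer.sh start, the JVM has most likely exited, and the log file is the next place to look.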
2. Check the log file
2021-03-11 15:17:21,230 [myid:0] - ERROR [main:QuorumPeer@937] - Unable to load database on disk
java.io.IOException: No snapshot found, but there are log entries. Something is broken!
at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:240)
at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:240)
at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:901)
at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:887)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:205)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:123)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:82)
2021-03-11 15:17:21,232 [myid:0] - ERROR [main:QuorumPeerMain@101] - Unexpected exception, exiting abnormally
java.lang.RuntimeException: Unable to run quorum server
at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:938)
at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:887)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:205)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:123)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:82)
Caused by: java.io.IOException: No snapshot found, but there are log entries. Something is broken!
at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:240)
at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:240)
at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:901)
... 4 more
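The exception message describes a state that can be checked directly on disk: no snapshot.* files where snapshots are expected, while log.* transaction files still exist. The sketch below tests for that condition. The function names are hypothetical, and the two directories passed in are assumed to be the data and logs directories of this installation.

```shell
# Count how many of the given paths actually exist.
# An unmatched glob stays literal and fails the -e test, so it counts as 0.
count_existing() {
    n=0
    for f in "$@"; do
        [ -e "$f" ] && n=$((n + 1))
    done
    echo "$n"
}

# Hypothetical check: succeeds (returns 0) when the snapshot directory has no
# snapshot.* files but the transaction log directory still has log.* files.
# Usage: zk_logs_without_snapshot DATA_DIR LOGS_DIR
zk_logs_without_snapshot() {
    snaps=$(count_existing "$1"/version-2/snapshot.*)
    logs=$(count_existing "$2"/version-2/log.*)
    [ "$snaps" -eq 0 ] && [ "$logs" -gt 0 ]
}
```

A missing version-2 directory under the data directory is treated the same as an empty one, which matches the situation after the cleanup described at the top.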
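3. Cause
Cleaning only the data directory is not enough: the files under the logs directory (mainly version-2) must be removed as well. With the snapshots gone but the transaction logs still on disk, ZooKeeper refuses to load the database, which is the "No snapshot found, but there are log entries" error in the log.
After removing the files in the data directory (except myid) and the files under logs, the service starts and the status check is normal: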
[root@hdss-7-21 apache-zookeeper-3.5.8-bin]# zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/apache-zookeeper-3.5.8-bin/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@hdss-7-21 apache-zookeeper-3.5.8-bin]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/apache-zookeeper-3.5.8-bin/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost.
Mode: follower
[root@hdss-7-21 apache-zookeeper-3.5.8-bin]# jps
11478 QuorumPeerMain
15400 Jps
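The cleanup that fixed the problem can be sketched as a small function that removes version-2 from both directories plus the pid file, and leaves myid alone. The function name and arguments are hypothetical. It deletes all data stored on this node, so it should only be run on a practice setup and only after the service has been stopped.

```shell
# Hypothetical helper: wipe ZooKeeper's on-disk state in both directories
# while keeping myid. Run only after the server has been stopped.
# Usage: zk_wipe_state DATA_DIR LOGS_DIR
zk_wipe_state() {
    data_dir=$1
    log_dir=$2
    # Refuse empty arguments so the paths never expand to "/version-2".
    if [ -z "$data_dir" ] || [ -z "$log_dir" ]; then
        echo "usage: zk_wipe_state DATA_DIR LOGS_DIR" >&2
        return 1
    fi
    # Remove snapshots and transaction logs together so they stay consistent.
    rm -rf "$data_dir/version-2" "$log_dir/version-2"
    # Remove the stale pid file; myid is deliberately left untouched.
    rm -f "$data_dir/zookeeper_server.pid"
}
```

Removing both version-2 directories in one step avoids the half-cleaned state that caused the startup failure in the first place.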
4. Connect with the client and check
[root@hdss-7-21 kafka]# zkCli.sh
...
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper]
As shown, zk no longer holds any other data; only ZooKeeper's own node remains.




