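Today two Elasticsearch node servers lost power unexpectedly and rebooted. On restart we hit a translog corruption error, and this post records how it was fixed.
1. Problem
Each machine holds more than 200 million documents in a single index with 20+ fields. Data was being written continuously through the bulk API with bulk.size=5000 when the machine lost power and went down.
After the machine was repaired and ES was restarted, a TranslogCorruptedException appeared: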
[2018-04-18 16:29:25,950][WARN ][indices.cluster ] [elasticsearch_43_ssd] [ocslog-2018.04.18][0] failed to start shard
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [ocslog-2018.04.18][0] failed to recover shard
at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:287)
at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.index.translog.TranslogCorruptedException: translog corruption while reading from stream
at org.elasticsearch.index.translog.ChecksummedTranslogStream.read(ChecksummedTranslogStream.java:70)
at org.elasticsearch.index.gateway.local.LocalIndexShardGateway
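When several shards are affected, it helps to pull the index name and shard number out of the log before touching any files. The following is a minimal sketch, assuming the log lines follow the same "[index][shard] failed to start shard" shape as the WARN line above; the function name `failed_shards` is made up for illustration and is not part of Elasticsearch.

```python
import re

# Matches the "[index][shard] failed to start shard" part of a log line
# shaped like the WARN line quoted in this post (an assumption about format).
FAILED_SHARD = re.compile(
    r"\[(?P<index>[^\]\s]+)\]\[(?P<shard>\d+)\] failed to start shard"
)


def failed_shards(log_lines):
    """Return the set of (index, shard) pairs reported as failing to start."""
    found = set()
    for line in log_lines:
        match = FAILED_SHARD.search(line)
        if match:
            found.add((match.group("index"), int(match.group("shard"))))
    return found


sample = [
    "[2018-04-18 16:29:25,950][WARN ][indices.cluster ] [elasticsearch_43_ssd] "
    "[ocslog-2018.04.18][0] failed to start shard",
    "at java.lang.Thread.run(Thread.java:745)",
]
print(failed_shards(sample))
```

For the log above this reports shard 0 of ocslog-2018.04.18, which is the shard whose translog needs attention.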

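This post documents a translog corruption problem on Elasticsearch node servers caused by an unexpected power loss, covering the symptoms, the steps taken to fix it, and why the translog matters. After the failure, the TranslogCorruptedException was resolved by shutting down the cluster, deleting the corrupted translog recovery file, and then restarting the cluster.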
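The fix described above (stop the cluster, remove the corrupted translog file, restart) could be sketched as follows. This is not the author's script: it is a cautious variant that moves the files to a backup directory instead of deleting them, so they can be restored if needed. The helper name `quarantine_translog` is hypothetical, and the caller must supply the shard's translog directory; on ES 1.x installs it usually sits under `.../indices/<index>/<shard>/translog`, but that layout is an assumption to verify on your own node. Any operations that exist only in the removed translog and were never flushed to the Lucene index are lost.

```python
import shutil
import tempfile
from pathlib import Path


def quarantine_translog(translog_dir, backup_dir):
    """Move every file in translog_dir into backup_dir and return their names.

    Run this only while the Elasticsearch node is stopped.
    """
    translog_dir = Path(translog_dir)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for entry in sorted(translog_dir.iterdir()):
        if entry.is_file():
            shutil.move(str(entry), str(backup_dir / entry.name))
            moved.append(entry.name)
    return moved


if __name__ == "__main__":
    # Demo on a throwaway directory; the file name below is made up.
    with tempfile.TemporaryDirectory() as tmp:
        translog = Path(tmp) / "translog"
        translog.mkdir()
        (translog / "translog-1.recovering").write_bytes(b"\x00")
        print(quarantine_translog(translog, Path(tmp) / "translog.bak"))
```

Moving rather than deleting costs only disk space and leaves a way back if the shard still fails to start after the restart.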



