NameNode shutdown error: Error: flush failed for required journal

This article walks through why the NameNode in a Hadoop cluster suddenly stopped with this error, explains how a high-availability (HA) cluster works with particular emphasis on the JournalNode quorum and how to configure it, and offers optimizations: increasing the JournalNode write timeout, tuning the NameNode's Java parameters to shorten full GC pauses, and switching to the CMS garbage collector.
The active NameNode of the Hadoop cluster suddenly stopped, with the following error:
2016-03-23 17:12:25,877 INFO  namenode.FSEditLog (FSEditLog.java:endCurrentLogSegment(1153)) - Ending log segment 574144342
2016-03-23 17:12:26,350 WARN  client.QuorumJournalManager (QuorumCall.java:waitFor(134)) - Waited 19047 ms (timeout=20000 ms) for a response for sendEdits. Succeeded so far: [192.168.14.16:8485]
2016-03-23 17:12:27,304 FATAL namenode.FSEditLog (JournalSet.java:mapJournalsAndReportErrors(364)) - Error: flush failed for required journal (JournalAndStream(mgr=QJM to [192.168.14.14:8485, 192.168.14.15:8485, 192.168.14.16:8485], stream=QuorumOutputStream starting at txid 574144342))
java.io.IOException: Timed out waiting 20000ms for a quorum of nodes to respond.
        at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:137)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumOutputStream.flushAndSync(QuorumOutputStream.java:107)
        at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:113)
        at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:107)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream$8.apply(JournalSet.java:499)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:359)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.access$100(JournalSet.java:57)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream.flush(JournalSet.java:495)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:623)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renameToInt(FSNamesystem.java:3188)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renameTo(FSNamesystem.java:3149)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rename(NameNodeRpcServer.java:701)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.rename(ClientNamenodeProtocolServerSideTranslatorPB.java:523)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
2016-03-23 17:12:27,304 WARN  client.QuorumJournalManager (QuorumOutputStream.java:abort(72)) - Aborting QuorumOutputStream starting at txid 574144342
2016-03-23 17:12:27,308 INFO  util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
2016-03-23 17:12:27,313 INFO  namenode.NameNode (StringUtils.java:run(640)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at nn01
************************************************************/


A cluster with HA enabled must depend on the JournalNode quorum in order to work at all.

In this respect it is similar to HBase's dependency on ZooKeeper: if the NameNode cannot obtain a JournalNode quorum, HDFS cannot be formatted and cannot start.
The JournalNodes carry little load, so it is fine to run them on the machines hosting the master daemons.

On the configuration side, besides the usual HA settings you must specify the JournalNode quorum and the directory each JournalNode uses for storage. These are set via "dfs.namenode.shared.edits.dir" and "dfs.journalnode.edits.dir" respectively; a sketch follows below.
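
A minimal hdfs-site.xml sketch of the two properties, assuming the three JournalNode hosts seen in the log above; the nameservice ID "mycluster" and the local storage path are placeholders, not taken from this cluster:

<property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://192.168.14.14:8485;192.168.14.15:8485;192.168.14.16:8485/mycluster</value>
</property>
<property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/data/hadoop/journalnode</value>
</property>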

The error above should therefore be caused by the NameNode failing to reach the JournalNodes:
when the write to the JournalNodes timed out, the terminate method of the ExitUtil class was triggered, killing the current process.

In the JournalSet class:


  for (JournalAndStream jas : journals) {
    try {
      closure.apply(jas);
    } catch (Throwable t) {
      if (jas.isRequired()) {
        final String msg = "Error: " + status + " failed for required journal ("
            + jas + ")";
        LOG.fatal(msg, t);
        // If we fail on *any* of the required journals, then we must not
        // continue on any of the other journals. Abort them to ensure that
        // retry behavior doesn't allow them to keep going in any way.
        abortAllJournals();
        // the current policy is to shutdown the NN on errors to shared edits
        // dir. There are many code paths to shared edits failures - syncs,
        // roll of edits etc. All of them go through this common function
        // where the isRequired() check is made. Applying exit policy here
        // to catch all code paths.
        terminate(1, msg);
      } else {
        LOG.error("Error: " + status + " failed for (journal " + jas + ")", t);
        badJAS.add(jas);
      }
    }
  }

The terminate method of the ExitUtil class calls System.exit:


  /**
   * Terminate the current process. Note that terminate is the *only* method
   * that should be used to terminate the daemon processes.
   * @param status exit code
   * @param msg message used to create the ExitException
   * @throws ExitException if System.exit is disabled for test purposes
   */
  public static void terminate(int status, String msg) throws ExitException {
    LOG.info("Exiting with status " + status);
    if (systemExitDisabled) {
      ExitException ee = new ExitException(status, msg);
      LOG.fatal("Terminate called", ee);
      if (null == firstExitException) {
        firstExitException = ee;
      }
      throw ee;
    }
    System.exit(status);
  }

After that, restarting the NameNode brought everything back, which is exactly why HA matters in Hadoop: one NameNode stopping does not take the cluster down.

Optional optimizations (I applied option 1):

1) Increase the JournalNode write timeout, the dfs.qjournal.write-txns.timeout.ms parameter.

Timeouts like this are actually quite common in production, so it is worth raising the default 20 s to a larger value such as 60 or 90 s.

Add the following property to hdfs-site.xml under hadoop/etc/hadoop:
<property>
        <name>dfs.qjournal.write-txns.timeout.ms</name>
        <value>60000</value>
</property>

I found this setting on someone else's blog; curiously, it does not appear at all in the hdfs-default.xml reference on the official Hadoop site:

http://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml


2) Tune the NameNode's Java parameters so that full GC is triggered earlier; with less garbage accumulated, each full GC takes less time.

3) By default the NameNode does full GC with the parallel collector, which is stop-the-world; switch to CMS instead. Adjust the NameNode's startup parameters (how to apply them is sketched after the flag list):
-XX:+UseCompressedOops
-XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled
-XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=0
-XX:+CMSParallelRemarkEnabled -XX:+DisableExplicitGC
-XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=75
-XX:SoftRefLRUPolicyMSPerMB=0
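
These flags would typically go into the NameNode's JVM options. A sketch assuming the usual hadoop-env.sh convention (HADOOP_NAMENODE_OPTS); heap-size flags are omitted and should be set for your workload:

# hadoop-env.sh: add the CMS flags to the NameNode's JVM options
export HADOOP_NAMENODE_OPTS="-XX:+UseCompressedOops \
  -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled \
  -XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=0 \
  -XX:+CMSParallelRemarkEnabled -XX:+DisableExplicitGC \
  -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=75 \
  -XX:SoftRefLRUPolicyMSPerMB=0 ${HADOOP_NAMENODE_OPTS}"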