Problem description:
The following errors appeared when starting the Hadoop cluster:
[hadoop@node01 hadoop-2.6.0-cdh5.14.2]$ start-dfs.sh
Error: Could not find or load main class org.apache.hadoop.hdfs.tools.GetConf
Starting namenodes on []
node01: starting namenode, logging to /bigdata/install/hadoop-2.6.0-cdh5.14.2/logs/hadoop-hadoop-namenode-node01.out
node02: starting namenode, logging to /bigdata/install/hadoop-2.6.0-cdh5.14.2/logs/hadoop-hadoop-namenode-node02.out
node03: starting namenode, logging to /bigdata/install/hadoop-2.6.0-cdh5.14.2/logs/hadoop-hadoop-namenode-node03.out
node01: starting datanode, logging to /bigdata/install/hadoop-2.6.0-cdh5.14.2/logs/hadoop-hadoop-datanode-node01.out
node02: starting datanode, logging to /bigdata/install/hadoop-2.6.0-cdh5.14.2/logs/hadoop-hadoop-datanode-node02.out
node03: starting datanode, logging to /bigdata/install/hadoop-2.6.0-cdh5.14.2/logs/hadoop-hadoop-datanode-node03.out
Error: Could not find or load main class org.apache.hadoop.hdfs.tools.GetConf
[hadoop@node01 hadoop-2.6.0-cdh5.14.2]$ jps
16800 Jps
16634 DataNode
The NameNode log shows:
2020-09-20 23:59:32,776 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
2020-09-20 23:59:32,776 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
java.io.IOException: NameNode is not formatted.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:232)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1150)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:797)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:614)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:676)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:844)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:823)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1547)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1615)
2020-09-20 23:59:32,779 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2020-09-20 23:59:32,784 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at node01/192.168.52.111
************************************************************/
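Problem analysis
This happens because Hadoop cannot find the HDFS jars: the classpath set up by hadoop-config.sh under libexec does not include them. Because the error occurred while starting the NameNode, we edited hadoop-config.sh in the libexec directory on the NameNode host, node01, and appended the following line at the end of the file: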
CLASSPATH=${CLASSPATH}:$HADOOP_HDFS_HOME'/share/hadoop/hdfs/*'
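One way to confirm that the edit took effect is to check whether an entry under share/hadoop/hdfs shows up in the classpath Hadoop computes. The following is a minimal sketch, assuming the `hadoop classpath` command is available on the PATH; the helper name `classpath_has_hdfs` is hypothetical and not part of Hadoop.

```shell
# Hypothetical helper (not part of Hadoop): succeeds if a colon-separated
# classpath string contains an entry under share/hadoop/hdfs.
classpath_has_hdfs() {
    printf '%s\n' "$1" | tr ':' '\n' | grep -q 'share/hadoop/hdfs'
}

# Usage on a node (assumes the hadoop command is on the PATH):
#   classpath_has_hdfs "$(hadoop classpath)" && echo "HDFS jars are on the classpath"
```

If the check fails after the edit, it is worth verifying that the modified hadoop-config.sh is the one actually sourced by the start scripts.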
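Next, we ran hdfs namenode -format again to format the NameNode. This time we hit a different problem: libjvm.so could not be found. We checked the native library's dependencies with the following command: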
[hadoop@node01 native]$ ldd libhadoop.so.1.0.0
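Unresolved dependencies show up in `ldd` output as lines ending in `not found`. A small sketch for filtering them out of the full listing; the function name `missing_libs` is made up for illustration.

```shell
# Hypothetical helper: reads ldd output on stdin and prints the name of
# every library that the dynamic linker could not resolve.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Usage in the native library directory:
#   ldd libhadoop.so.1.0.0 | missing_libs
```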
We then ran find / -name libjvm.so to locate libjvm.so and added its directory to LD_LIBRARY_PATH in /etc/profile:
export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native:$JAVA_HOME/jre/lib/amd64/server:$LD_LIBRARY_PATH
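Searching from `/` can be slow, so the lookup can be limited to the JDK directory instead. The sketch below assumes `JAVA_HOME` points at the JDK in use; the helper name `libjvm_dir` is hypothetical.

```shell
# Hypothetical helper: prints the directory of the first libjvm.so that
# find reports under the given root (for example "$JAVA_HOME").
libjvm_dir() {
    find "$1" -name libjvm.so -exec dirname {} \; 2>/dev/null | head -n 1
}

# Usage (assumes JAVA_HOME is set):
#   export LD_LIBRARY_PATH="$(libjvm_dir "$JAVA_HOME"):$LD_LIBRARY_PATH"
```

If a JDK ships more than one libjvm.so (for example server and client variants), the first match may not be the intended one, so the printed directory should be checked before exporting it.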
We also set the following environment variables:
export HADOOP_HOME=/bigdata/install/hadoop-2.6.0-cdh5.14.2
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
We then formatted the NameNode once more, and this time the format succeeded.
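To check that the file system was working, we created a directory with hdfs dfs -mkdir and uploaded a file. At this point a ClusterID mismatch appeared, so we copied the NameNode's ClusterID from the log
and used it to overwrite the clusterID of the failing DataNode.
The exact path depends on the configuration in hdfs-site.xml.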
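The clusterID is stored as a `clusterID=...` line in the `current/VERSION` file under the storage directories configured by `dfs.namenode.name.dir` and `dfs.datanode.data.dir`. The overwrite described above can be sketched as follows; the function name `sync_cluster_id` and the paths in the usage comment are hypothetical placeholders, not the paths used on this cluster.

```shell
# Hypothetical helper: copies the clusterID value from a NameNode VERSION
# file into a DataNode VERSION file, keeping a .bak copy of the original.
sync_cluster_id() {
    nn_version=$1
    dn_version=$2
    # Extract the value after "clusterID=" from the NameNode file.
    cid=$(sed -n 's/^clusterID=//p' "$nn_version")
    [ -n "$cid" ] || return 1
    cp "$dn_version" "$dn_version.bak"
    # Rewrite only the clusterID line; all other lines are left unchanged.
    sed "s/^clusterID=.*/clusterID=$cid/" "$dn_version.bak" > "$dn_version"
}

# Usage (placeholder paths; substitute the directories from hdfs-site.xml):
#   sync_cluster_id /path/to/namenode/current/VERSION /path/to/datanode/current/VERSION
```

It is safest to stop the DataNode before editing its VERSION file and to restart it afterwards.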
References
- https://blog.youkuaiyun.com/yanghuadong_1992/article/details/106190794
- https://www.cnblogs.com/liuchangchun/p/4630305.html
- https://issues.apache.org/jira/browse/HDFS-1594
- https://www.cnblogs.com/wangshen31/p/9900987.html