关于Hadoop-Streaming学习中碰到的问题

Hadoop在分布式计算方面很强大,而Python在文本处理也是相当方便,那么有这两者的结合吗?有,答案就是Hadoop-Streaming。Hadoop-Streaming可以将Hadoop与主流语言结合起来,使用方便,效果很好。个人觉得Pig在处理数据集时很不方便,特别是在计算百分比等运算时,而Hadoop-Streaming是可以替代Pig的。

1.Streaming固定的代码,该代码可以放在shell脚本中

 

RED_NUM=47

mapper_file="wordcount.py"
mapper_cmd="$mapper_file"
reducer_cmd="cat"

Hadoop="/usr/bin/hadoop"
Streaming="/usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2-cdh3u1.jar"

hadoop fs -rmr $OUTPUT
${Hadoop} jar ${Streaming} \
-D mapred.job.name="$JOBNAME" \
-D mapred.job.queue.name=$queuename \
-D mapred.map.tasks=500 \
-D mapred.min.split.size=1073741824 \
-D mapred.reduce.tasks="$RED_NUM" \
-D stream.num.map.output.key.fields=3 \
-D num.key.fields.for.partition=3 \
-input "$INPUT" \
-output "$OUTPUT" \
-mapper "$mapper_cmd" \
-reducer "$reducer_cmd" \
-file "$mapper_file" \
-partitioner org.apache.hadoop.mapred.lib.KeyFieldBasedPartitioner

有几个重要参数解释下

 

 

(1)-input:输入文件路径

(2)-output:输出文件路径

(3)-mapper:用户自己写的mapper程序,可以是可执行文件或者脚本

(4)-reducer:用户自己写的reducer程序,可以是可执行文件或者脚本

(5)-file:打包文件到提交的作业中,可以是mapper或者reducer要用的输入文件,如配置文件,字典等。

         这个一般是必须有的,因为mapper和reducer函数都是写在本地的文件中,因此需要将文件上传到集群中才能被执行

(6)-partitioner:用户自定义的partitioner程序

(7)-D:作业的一些属性(以前用的是-jonconf),具体有:
              1)mapred.map.tasks:map task数目  

          设置的数目与实际运行的值并不一定相同,若输入文件含有M个part,而此处设置的map_task数目超过M,那么实际运行map_task仍然是M

        2)mapred.reduce.tasks:reduce task数目  不设置的话,默认值就为1
              3)num.key.fields.for.partition=N:shuffle阶段将数据集的前N列作为Key;所以对于wordcount程序,map输出为“word  1”,shuffle是以word作为Key,因此这里N=1

(8)-D stream.num.map.output.key.fields=3 这个是指在reduce之前将数据按前3列做排序,一般情况下可以去掉

这里说一下,之前在执行Streaming-Python时,总是报找不到-file上传的文件,找这个错误花了好长时间;最后终于发现是文件格式的问题:我在Linux上写的mapper文件不知咋的变为dos类型文件格式:

于是将dos文件转换为UNIX文件:dos2unix  xxx.py

之后就上传成功!并成功执行!

2.MapReduce是万能的吗?

之前我以为是的,于是一碰到数据量大的就尝试使用MapReduce。但是最近的工作中发现有种情况并不适用于MapReduce,就是涉及到整个数据集的一些运算!

举个例子,在Wordcount中,如果要计算每个单词出现的词频占整个文档中单词词频的百分比,这时候写MapReduce程序就要注意,在每个reducer中“整个文档中单词词频”必须是真正的“整个文档中单词词频”;这个例子还比较特殊,涉及到整个数据集的仅仅是总词频,可以通过传入多个输入来解决。但是如果涉及到的一些运算必须要整个数据集,那么就没法解决了。或许你可以在每个mapper中输出为“flag    word    1”,然后num.key.fields.for.partition=1  这样看似可以解决,其实这样就将所有数据都reduce到一台机器上,成了单机计算,而数据集过大的话,单机是无法完成任务的。

3.Streaming程序中,当map或reduce中有一个是cat时

需要注意到底谁是cat,如果只是纯粹的过滤数据或打印,两种情况都行;但如果涉及到聚合操作,这里面就有区别了。

 

未完待续... ...

4.执行过程中,报“1: import: command not found”

这个就是Python文件的顶部没有写上执行方式:#!/usr/bin/python

 

5.执行过程中,报Broken Pipe错误:

2014-10-31 21:28:14,319 INFO [main] org.apache.hadoop.streaming.PipeMapRed: PipeMapRed exec [/usr/bin/python, getBidFeatureMap_suning.py]
2014-10-31 21:28:14,449 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=1/0/0 in:NA [rec/s] out:NA [rec/s]
2014-10-31 21:28:14,450 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=10/0/0 in:NA [rec/s] out:NA [rec/s]
2014-10-31 21:28:14,473 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=100/0/0 in:NA [rec/s] out:NA [rec/s]
2014-10-31 21:28:14,516 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=155/0/0 in:NA [rec/s] out:NA [rec/s]
minRecWrittenToEnableSkip_=9223372036854775807 HOST=null
USER=lming_08
HADOOP_USER=null
last tool output: |null|

java.io.IOException: Broken pipe
	at java.io.FileOutputStream.writeBytes(Native Method)
	at java.io.FileOutputStream.write(FileOutputStream.java:345)
	at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
	at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
	at java.io.DataOutputStream.write(DataOutputStream.java:107)
	at org.apache.hadoop.streaming.io.TextInputWriter.writeUTF8(TextInputWriter.java:72)
	at org.apache.hadoop.streaming.io.TextInputWriter.writeValue(TextInputWriter.java:51)
	at org.apache.hadoop.streaming.PipeMapper.map(PipeMapper.java:106)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
	at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)


刚开始以为是map程序中对空行未处理导致的,后来发现是map中有对一个上传的文件进行读取,而该文件超过100MB,在处理该文件时会执行10多亿次循环,导致超时了。

 

6.

SyntaxError: Non-ASCII character '\xe6' in file /hadoop/yarn/local/usercache/lming_08/appcache/application_1415110953023_38985/container_1415110953023_38985_01_000003/./get_visitor_company_info_map.py on line 8, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.impl.MetricsSystemImpl).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Python文件中出现中文,导致上传到服务器中无法执行
 
7.Streaming error: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
报code 1 错误可能是因为输出路径不合法,例如路径中含空格、tab等
 
 

 

7.如果想对第一列和第二列排序,其中第一列相同时,第二列逆排序

-D mapred.output.key.comparator.class=org.apache.hadoop.mapred.lib.KeyFieldBasedComparator \

-D mapred.text.key.comparator.options="-k1,1 -k2rn" \

-k1,1表示对第一列到第一列正排序(即对第一列正排序)

-k2rn 表示对第二列逆排序

http://stackoverflow.com/questions/20465438/hadoop-mapreduce-streaming-sorting-on-multiple-columns

8.报"Premature EOF from inputStream"错误

java.io.EOFException: Premature EOF from inputStream
	at com.hadoop.compression.lzo.LzopInputStream.readFully(LzopInputStream.java:75)
	at com.hadoop.compression.lzo.LzopInputStream.readHeader(LzopInputStream.java:114)
	at com.hadoop.compression.lzo.LzopInputStream.<init>(LzopInputStream.java:54)
	at com.hadoop.compression.lzo.LzopCodec.createInputStream(LzopCodec.java:83)
	at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:122)
	at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
	at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:169)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:435)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:346)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:166)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:2037)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)

有可能是因为部分lzo文件大小为0导致的,删掉就可以了。

未完待续... ...

参考资料:

http://dongxicheng.org/mapreduce/hadoop-streaming-programming/

http://hadoop.apache.org/docs/r0.19.1/cn/streaming.html#%E4%B8%80%E4%B8%AA%E5%AE%9E%E7%94%A8%E7%9A%84Partitioner%E7%B1%BB

                                  

 

DEPRECATED: Use of this script to execute hdfs command is deprecated. Instead use the hdfs command for it. 2025-06-18 16:50:39,734 INFO datanode.DataNode: STARTUP_MSG: /************************************************************ STARTUP_MSG: Starting DataNode STARTUP_MSG: host = LAPTOP-FK5QKFGQ/192.168.10.1 STARTUP_MSG: args = [] STARTUP_MSG: version = 3.2.2 STARTUP_MSG: classpath = D:\pyspark\Hadoop\hadoop-3.2.2\etc\hadoop;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\accessors-smart-1.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\animal-sniffer-annotations-1.17.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\asm-5.0.4.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\audience-annotations-0.5.0.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\avro-1.7.7.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\checker-qual-2.5.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\commons-beanutils-1.9.4.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\commons-cli-1.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\commons-codec-1.11.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\commons-collections-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\commons-compress-1.19.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\commons-configuration2-2.1.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\commons-io-2.5.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\commons-lang3-3.7.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\commons-logging-1.1.3.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\commons-math3-3.1.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\commons-net-3.6.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\commons-text-1.4.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\curator-client-2.13.0.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\curator-framework-2.13.0.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\curator-recipes-2.13.0.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\dnsjava-2.1.7.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\error_prone_annotations-2.2.0.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\failureaccess-1.0.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\gson-2.2.4.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\guava-27.0-jre.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\hadoop-annotations-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\hadoop-auth-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\htrace-core4-4.1.0-incubating.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\httpclient-4.5.13.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\httpcore-4.4.13.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\j2objc-annotations-1.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\jackson-annotations-2.9.10.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\jackson-core-2.9.10.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\jackson-core-asl-1.9.13.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\jackson-databind-2.9.10.4.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\jackson-jaxrs-1.9.13.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\jackson-mapper-asl-1.9.13.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\jackson-xc-1.9.13.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\javax.activation-api-1.2.0.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\javax.servlet-api-3.1.0.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\jaxb-api-2.2.11.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\jaxb-impl-2.2.3-1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\jcip-annotations-1.0-1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\jersey-core-1.19.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\jersey-json-1.19.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\jersey-server-1.19.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\jersey-servlet-1.19.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\jettison-1.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\jetty-http-9.4.20.v20190813.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\jetty-io-9.4.20.v20190813.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\jetty-security-9.4.20.v20190813.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\jetty-server-9.4.20.v20190813.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\jetty-servlet-9.4.20.v20190813.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\jetty-util-9.4.20.v20190813.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\jetty-webapp-9.4.20.v20190813.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\jetty-xml-9.4.20.v20190813.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\jsch-0.1.55.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\json-smart-2.3.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\jsp-api-2.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\jsr305-3.0.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\jsr311-api-1.1.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\jul-to-slf4j-1.7.25.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\kerb-admin-1.0.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\kerb-client-1.0.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\kerb-common-1.0.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\kerb-core-1.0.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\kerb-crypto-1.0.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\kerb-identity-1.0.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\kerb-server-1.0.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\kerb-simplekdc-1.0.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\kerb-util-1.0.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\kerby-asn1-1.0.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\kerby-config-1.0.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\kerby-pkix-1.0.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\kerby-util-1.0.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\kerby-xdr-1.0.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\listenablefuture-9999.0-empty-to-avoid-conflict-with-guava.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\log4j-1.2.17.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\metrics-core-3.2.4.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\netty-3.10.6.Final.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\nimbus-jose-jwt-7.9.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\paranamer-2.3.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\protobuf-java-2.5.0.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\re2j-1.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\slf4j-api-1.7.25.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\slf4j-log4j12-1.7.25.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\snappy-java-1.0.5.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\stax2-api-3.1.4.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\token-provider-1.0.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\woodstox-core-5.0.3.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\lib\zookeeper-3.4.13.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\hadoop-common-3.2.2-tests.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\hadoop-common-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\hadoop-kms-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\common\hadoop-nfs-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\accessors-smart-1.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\animal-sniffer-annotations-1.17.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\asm-5.0.4.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\audience-annotations-0.5.0.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\avro-1.7.7.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\checker-qual-2.5.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\commons-beanutils-1.9.4.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\commons-cli-1.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\commons-codec-1.11.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\commons-collections-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\commons-compress-1.19.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\commons-configuration2-2.1.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\commons-daemon-1.0.13.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\commons-io-2.5.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\commons-lang3-3.7.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\commons-logging-1.1.3.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\commons-math3-3.1.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\commons-net-3.6.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\commons-text-1.4.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\curator-client-2.13.0.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\curator-framework-2.13.0.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\curator-recipes-2.13.0.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\dnsjava-2.1.7.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\error_prone_annotations-2.2.0.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\failureaccess-1.0.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\gson-2.2.4.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\guava-27.0-jre.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\hadoop-annotations-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\hadoop-auth-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\htrace-core4-4.1.0-incubating.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\httpclient-4.5.13.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\httpcore-4.4.13.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\j2objc-annotations-1.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\jackson-annotations-2.9.10.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\jackson-core-2.9.10.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\jackson-core-asl-1.9.13.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\jackson-databind-2.9.10.4.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\jackson-jaxrs-1.9.13.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\jackson-mapper-asl-1.9.13.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\jackson-xc-1.9.13.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\javax.activation-api-1.2.0.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\javax.servlet-api-3.1.0.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\jaxb-api-2.2.11.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\jaxb-impl-2.2.3-1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\jcip-annotations-1.0-1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\jersey-core-1.19.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\jersey-json-1.19.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\jersey-server-1.19.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\jersey-servlet-1.19.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\jettison-1.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\jetty-http-9.4.20.v20190813.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\jetty-io-9.4.20.v20190813.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\jetty-security-9.4.20.v20190813.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\jetty-server-9.4.20.v20190813.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\jetty-servlet-9.4.20.v20190813.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\jetty-util-9.4.20.v20190813.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\jetty-util-ajax-9.4.20.v20190813.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\jetty-webapp-9.4.20.v20190813.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\jetty-xml-9.4.20.v20190813.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\jsch-0.1.55.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\json-simple-1.1.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\json-smart-2.3.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\jsr305-3.0.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\jsr311-api-1.1.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\kerb-admin-1.0.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\kerb-client-1.0.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\kerb-common-1.0.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\kerb-core-1.0.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\kerb-crypto-1.0.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\kerb-identity-1.0.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\kerb-server-1.0.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\kerb-simplekdc-1.0.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\kerb-util-1.0.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\kerby-asn1-1.0.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\kerby-config-1.0.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\kerby-pkix-1.0.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\kerby-util-1.0.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\kerby-xdr-1.0.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\leveldbjni-all-1.8.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\listenablefuture-9999.0-empty-to-avoid-conflict-with-guava.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\log4j-1.2.17.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\netty-3.10.6.Final.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\netty-all-4.1.48.Final.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\nimbus-jose-jwt-7.9.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\okhttp-2.7.5.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\okio-1.6.0.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\paranamer-2.3.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\protobuf-java-2.5.0.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\re2j-1.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\snappy-java-1.0.5.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\stax2-api-3.1.4.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\token-provider-1.0.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\woodstox-core-5.0.3.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\lib\zookeeper-3.4.13.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\hadoop-hdfs-3.2.2-tests.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\hadoop-hdfs-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\hadoop-hdfs-client-3.2.2-tests.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\hadoop-hdfs-client-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\hadoop-hdfs-httpfs-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\hadoop-hdfs-native-client-3.2.2-tests.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\hadoop-hdfs-native-client-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\hadoop-hdfs-nfs-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\hadoop-hdfs-rbf-3.2.2-tests.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\hdfs\hadoop-hdfs-rbf-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\lib\aopalliance-1.0.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\lib\bcpkix-jdk15on-1.60.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\lib\bcprov-jdk15on-1.60.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\lib\ehcache-3.3.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\lib\fst-2.50.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\lib\geronimo-jcache_1.0_spec-1.0-alpha-1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\lib\guice-4.0.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\lib\guice-servlet-4.0.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\lib\HikariCP-java7-2.4.12.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\lib\jackson-jaxrs-base-2.9.10.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\lib\jackson-jaxrs-json-provider-2.9.10.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\lib\jackson-module-jaxb-annotations-2.9.10.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\lib\java-util-1.9.0.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\lib\javax.inject-1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\lib\jersey-client-1.19.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\lib\jersey-guice-1.19.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\lib\json-io-2.5.1.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\lib\metrics-core-3.2.4.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\lib\mssql-jdbc-6.2.1.jre7.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\lib\objenesis-1.0.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\lib\snakeyaml-1.16.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\lib\swagger-annotations-1.5.4.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\hadoop-yarn-api-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\hadoop-yarn-applications-distributedshell-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\hadoop-yarn-applications-unmanaged-am-launcher-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\hadoop-yarn-client-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\hadoop-yarn-common-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\hadoop-yarn-registry-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\hadoop-yarn-server-applicationhistoryservice-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\hadoop-yarn-server-common-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\hadoop-yarn-server-nodemanager-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\hadoop-yarn-server-resourcemanager-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\hadoop-yarn-server-router-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\hadoop-yarn-server-sharedcachemanager-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\hadoop-yarn-server-tests-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\hadoop-yarn-server-timeline-pluginstorage-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\hadoop-yarn-server-web-proxy-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\hadoop-yarn-services-api-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\hadoop-yarn-services-core-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\yarn\hadoop-yarn-submarine-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\mapreduce\lib\hamcrest-core-1.3.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\mapreduce\lib\junit-4.11.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\mapreduce\hadoop-mapreduce-client-app-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\mapreduce\hadoop-mapreduce-client-common-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\mapreduce\hadoop-mapreduce-client-core-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\mapreduce\hadoop-mapreduce-client-hs-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\mapreduce\hadoop-mapreduce-client-hs-plugins-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\mapreduce\hadoop-mapreduce-client-jobclient-3.2.2-tests.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\mapreduce\hadoop-mapreduce-client-jobclient-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\mapreduce\hadoop-mapreduce-client-nativetask-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\mapreduce\hadoop-mapreduce-client-shuffle-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\mapreduce\hadoop-mapreduce-client-uploader-3.2.2.jar;D:\pyspark\Hadoop\hadoop-3.2.2\share\hadoop\mapreduce\hadoop-mapreduce-examples-3.2.2.jar STARTUP_MSG: build = Unknown -r 7a3bc90b05f257c8ace2f76d74264906f0f7a932; compiled by 'hexiaoqiao' on 2021-01-03T09:26Z STARTUP_MSG: java = 1.8.0_281 ************************************************************/ 2025-06-18 16:50:45,335 INFO checker.ThrottledAsyncChecker: Scheduling a check for [DISK]file:/D:/hadoop-3.2.2/data/datanode 2025-06-18 16:50:45,420 INFO impl.MetricsConfig: Loaded properties from hadoop-metrics2.properties 2025-06-18 16:50:45,483 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s). 2025-06-18 16:50:45,484 INFO impl.MetricsSystemImpl: DataNode metrics system started 2025-06-18 16:50:46,677 INFO common.Util: dfs.datanode.fileio.profiling.sampling.percentage set to 0. Disabling file IO profiling 2025-06-18 16:50:46,689 INFO datanode.BlockScanner: Initialized block scanner with targetBytesPerSec 1048576 2025-06-18 16:50:46,692 INFO datanode.DataNode: Configured hostname is LAPTOP-FK5QKFGQ 2025-06-18 16:50:46,693 INFO common.Util: dfs.datanode.fileio.profiling.sampling.percentage set to 0. Disabling file IO profiling 2025-06-18 16:50:46,695 INFO datanode.DataNode: Starting DataNode with maxLockedMemory = 0 2025-06-18 16:50:46,709 INFO datanode.DataNode: Opened streaming server at /0.0.0.0:9866 2025-06-18 16:50:46,710 INFO datanode.DataNode: Balancing bandwidth is 10485760 bytes/s 2025-06-18 16:50:46,710 INFO datanode.DataNode: Number threads for balancing is 50 2025-06-18 16:50:46,741 INFO util.log: Logging initialized @7589ms to org.eclipse.jetty.util.log.Slf4jLog 2025-06-18 16:50:51,787 INFO server.AuthenticationFilter: Unable to initialize FileSignerSecretProvider, falling back to use random secrets. 2025-06-18 16:50:51,821 INFO http.HttpRequestLog: Http request log for http.requests.datanode is not defined 2025-06-18 16:50:51,828 INFO http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter) 2025-06-18 16:50:51,829 INFO http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context datanode 2025-06-18 16:50:51,829 INFO http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs 2025-06-18 16:50:51,830 INFO http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static 2025-06-18 16:50:51,848 INFO http.HttpServer2: Jetty bound to port 38751 2025-06-18 16:50:51,849 INFO server.Server: jetty-9.4.20.v20190813; built: 2019-08-13T21:28:18.144Z; git: 84700530e645e812b336747464d6fbbf370c9a20; jvm 1.8.0_281-b09 2025-06-18 16:50:51,865 INFO server.session: DefaultSessionIdManager workerName=node0 2025-06-18 16:50:51,865 INFO server.session: No SessionScavenger set, using defaults 2025-06-18 16:50:51,867 INFO server.session: node0 Scavenging every 660000ms 2025-06-18 16:50:51,874 INFO handler.ContextHandler: Started o.e.j.s.ServletContextHandler@2421cc4{logs,/logs,file:///D:/pyspark/Hadoop/hadoop-3.2.2/logs/,AVAILABLE} 2025-06-18 16:50:51,874 INFO handler.ContextHandler: Started o.e.j.s.ServletContextHandler@21ba0741{static,/static,file:///D:/pyspark/Hadoop/hadoop-3.2.2/share/hadoop/hdfs/webapps/static/,AVAILABLE} 2025-06-18 16:50:51,926 INFO util.TypeUtil: JVM Runtime does not support Modules 2025-06-18 16:50:51,932 INFO handler.ContextHandler: Started o.e.j.w.WebAppContext@43f82e78{datanode,/,file:///D:/pyspark/Hadoop/hadoop-3.2.2/share/hadoop/hdfs/webapps/datanode/,AVAILABLE}{file:/D:/pyspark/Hadoop/hadoop-3.2.2/share/hadoop/hdfs/webapps/datanode} 2025-06-18 16:50:51,939 INFO server.AbstractConnector: Started ServerConnector@1e097d59{HTTP/1.1,[http/1.1]}{localhost:38751} 2025-06-18 16:50:51,940 INFO server.Server: Started @12789ms 2025-06-18 16:50:52,540 INFO web.DatanodeHttpServer: Listening HTTP traffic on /0.0.0.0:9864 2025-06-18 16:50:52,545 INFO util.JvmPauseMonitor: Starting JVM pause monitor 2025-06-18 16:50:52,545 INFO datanode.DataNode: dnUserName = aaa 2025-06-18 16:50:52,546 INFO datanode.DataNode: supergroup = supergroup 2025-06-18 16:50:52,573 INFO ipc.CallQueueManager: Using callQueue: class java.util.concurrent.LinkedBlockingQueue, queueCapacity: 1000, scheduler: class org.apache.hadoop.ipc.DefaultRpcScheduler, ipcBackoff: false. 2025-06-18 16:50:52,583 INFO ipc.Server: Starting Socket Reader #1 for port 9867 2025-06-18 16:50:52,720 INFO datanode.DataNode: Opened IPC server at /0.0.0.0:9867 2025-06-18 16:50:52,729 INFO datanode.DataNode: Refresh request received for nameservices: null 2025-06-18 16:50:52,735 INFO datanode.DataNode: Starting BPOfferServices for nameservices: <default> 2025-06-18 16:50:52,740 INFO datanode.DataNode: Block pool <registering> (Datanode Uuid unassigned) service to localhost/127.0.0.1:9000 starting to offer service 2025-06-18 16:50:52,745 INFO ipc.Server: IPC Server Responder: starting 2025-06-18 16:50:52,745 INFO ipc.Server: IPC Server listener on 9867: starting 2025-06-18 16:50:52,954 INFO datanode.DataNode: Acknowledging ACTIVE Namenode during handshakeBlock pool <registering> (Datanode Uuid unassigned) service to localhost/127.0.0.1:9000 2025-06-18 16:50:52,956 INFO common.Storage: Using 1 threads to upgrade data directories (dfs.datanode.parallel.volumes.load.threads.num=1, dataDirs=1) 2025-06-18 16:50:52,965 INFO common.Storage: Lock on D:\hadoop-3.2.2\data\datanode\in_use.lock acquired by nodename 16808@LAPTOP-FK5QKFGQ 2025-06-18 16:50:52,970 WARN common.Storage: Failed to add storage directory [DISK]file:/D:/hadoop-3.2.2/data/datanode java.io.IOException: Incompatible clusterIDs in D:\hadoop-3.2.2\data\datanode: namenode clusterID = CID-0243def2-304c-4ffd-871c-57b2cdf0182f; datanode clusterID = CID-a6ff55fc-9daf-4605-8a53-edaae5a9f8de at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:744) at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadStorageDirectory(DataStorage.java:294) at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadDataStorage(DataStorage.java:407) at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:387) at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:559) at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1748) at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1684) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:392) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:282) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:829) at java.lang.Thread.run(Thread.java:748) 2025-06-18 16:50:52,973 ERROR datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid cd899db0-fd95-4996-8250-261d1d36dbda) service to localhost/127.0.0.1:9000. Exiting. java.io.IOException: All specified directories have failed to load. at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:560) at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1748) at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1684) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:392) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:282) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:829) at java.lang.Thread.run(Thread.java:748) 2025-06-18 16:50:52,973 WARN datanode.DataNode: Ending block pool service for: Block pool <registering> (Datanode Uuid cd899db0-fd95-4996-8250-261d1d36dbda) service to localhost/127.0.0.1:9000 2025-06-18 16:50:52,974 INFO datanode.DataNode: Removed Block pool <registering> (Datanode Uuid cd899db0-fd95-4996-8250-261d1d36dbda) 2025-06-18 16:50:54,974 WARN datanode.DataNode: Exiting Datanode 2025-06-18 16:50:54,976 INFO datanode.DataNode: SHUTDOWN_MSG: /************************************************************ SHUTDOWN_MSG: Shutting down DataNode at LAPTOP-FK5QKFGQ/192.168.10.1 ************************************************************/
最新发布
06-19
评论
成就一亿技术人!
拼手气红包6.0元
还能输入1000个字符
 
红包 添加红包
表情包 插入表情
 条评论被折叠 查看
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值