java.io.IOException: Failed to replace a bad datanode

While appending to a file on a Hadoop cluster, an IOException was thrown because a failed DataNode could not be replaced and no other node was available. The fix is to set dfs.client.block.write.replace-datanode-on-failure.policy to NEVER in hdfs-site.xml, so the client stops trying to add a replacement node. Understanding the pipeline, the route data takes between DataNodes as Packets travel downstream and ACKs travel back, explains why the error occurs.

java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try.

1. On the NameNode host I ran the following command to append a local file to the end of an existing HDFS file:

hadoop fs -appendToFile ./liubei.txt /sanguo/shuguo/text.txt

and hit the following error:

java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try.

It means the client could not replace a failed DataNode in the existing write pipeline, because no other healthy DataNode was available to try.

2. Why it happens

My cluster has exactly 3 DataNodes, and the replication factor I configured is also 3. During a write to HDFS, if one DataNode in the pipeline fails, the client tries to keep the replica count at 3 by finding a spare DataNode to add to the pipeline. Since the whole cluster only has those 3 DataNodes, there is nothing to swap in, and the write fails with Failed to replace a bad datanode.
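You can confirm how many live DataNodes the pipeline has to draw from with dfsadmin. A quick check (assuming the Hadoop bin directory is on your PATH):

    # Cluster report; the "Live datanodes (N):" line counts the DataNodes
    # available to replace a failed pipeline member.
    hdfs dfsadmin -report | grep "Live datanodes"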

3. How to check your current replication factor

Open hdfs-site.xml in the Hadoop configuration directory (etc/hadoop under your Hadoop installation).

If it contains the following property, your replication factor is 3; the value element holds the replica count:

    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>

If the property is absent, Hadoop defaults to 3 replicas.
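You can also read both values from the command line instead of the XML; a quick check, reusing the file path from the example above:

    # Effective replication factor the client will apply to new files
    hdfs getconf -confKey dfs.replication

    # Replica count of an existing file (%r is the replication field of -stat)
    hadoop fs -stat "%r" /sanguo/shuguo/text.txt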

4. How to fix the error

Add the following property to hdfs-site.xml:

    <property>
        <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
        <value>NEVER</value>
    </property>

The Apache documentation confirms: NEVER: never add a new datanode. (The other accepted values for this policy are DEFAULT and ALWAYS.)

In other words, once the policy is NEVER, the client never tries to add a new DataNode and simply keeps writing through the nodes still alive in the pipeline.

As a rule of thumb, replacement-on-failure is not worth enabling in clusters with 3 or fewer DataNodes, because there is rarely a spare node to swap in.
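Because this is a client-side setting, it can also be passed for a single command through the generic -D option, without editing hdfs-site.xml. A minimal sketch, reusing the files from the example above:

    # One-off override; generic options such as -D must precede the subcommand
    hadoop fs -D dfs.client.block.write.replace-datanode-on-failure.policy=NEVER \
        -appendToFile ./liubei.txt /sanguo/shuguo/text.txt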

5. What is a pipeline?

A pipeline is the path along which the client streams data (Packets) to the DataNodes and receives their replies (ACKs, acknowledgments).

The pipeline is a chain of DataNodes, and data flows from the client into it. On the pipeline, if DataNode A sits closer to the client than DataNode B,

then A is said to be upstream of B, and B downstream of A.

How data moves along the pipeline (a log excerpt from a failing pipeline follows the steps):

  1. The client sends a Packet to the first DataNode in the pipeline; on receiving it, that DataNode forwards it to the next one, and every downstream DataNode does the same.

  2. Each DataNode that receives the Packet writes its data to disk.

  3. When the last DataNode in the pipeline receives the Packet, it sends an ACK back to the DataNode before it to confirm receipt, and every upstream DataNode does the same.

  4. When the client receives the ACK from the first DataNode, the Packet has been transferred successfully.
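When a pipeline member fails, this structure shows up directly in the client's DataStreamer output: the recovery warning lists the current pipeline and the node it judged bad. An excerpt from one such failed append (block pool ID and storage IDs shortened):

    WARN hdfs.DataStreamer: Error Recovery for BP-...:blk_1073741970_1146 in pipeline
      [DatanodeInfoWithStorage[192.168.60.103:9866,DS-...,DISK],
       DatanodeInfoWithStorage[192.168.60.104:9866,DS-...,DISK]]:
      datanode 1(DatanodeInfoWithStorage[192.168.60.104:9866,DS-...,DISK]) is bad.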

[Figure: the HDFS write pipeline, with Packets flowing downstream from the client and ACKs returning upstream]

For more on the pipeline, see https://www.cnblogs.com/lqlqlq/p/12321930.html


