[Exception] Spark writes to HBase fail on the DataNode pipeline: dfs.client.block.write.replace-datanode-on-failure.policy

Problem description:

When a Spark Streaming job has been writing to HBase for an extended period, the following exception appears:

2017-12-24 23:20:34  [ SparkListenerBus:540107357 ] - [ ERROR ]  Listener EventLoggingListener threw an exception
java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[ip:50010,DS-d9caacf5-a95a-45ab-8231-95decdbe4889,DISK], DatanodeInfoWithStorage[ip:50010,DS-7e2e14d9-3d8b-412d-bf38-3d2930a83d2f,DISK]], original=[DatanodeInfoWithStorage[ip:50010,DS-d9caacf5-a95a-45ab-8231-95decdbe4889,DISK], DatanodeInfoWithStorage[ip:50010,DS-7e2e14d9-3d8b-412d-bf38-3d2930a83d2f,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:1191)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:1265)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1433)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:1147)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:632)
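Why this happens: when a DataNode in the HDFS write pipeline fails, the DFS client tries to swap in a replacement DataNode before continuing the write. Under the DEFAULT policy that replacement is attempted whenever the pipeline would otherwise shrink below a safe replica count; on a small cluster (for example, three DataNodes with replication factor 3) there is no spare node left to pick, so a long-running writer such as a Spark Streaming job eventually fails with the IOException above.

The usual workaround, as the error message itself suggests, is to relax the replacement policy on the client side, either in the client's hdfs-site.xml or on the Configuration object the job builds. Below is a minimal sketch in Scala; the two property names are the real HDFS client settings referenced in the error, but how the Configuration is created and handed to your writer is an assumption about your particular job:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.hbase.HBaseConfiguration

// Client configuration handed to the HBase/HDFS writers in the job
// (assumes the job builds its own Configuration; adapt to your setup).
val conf: Configuration = HBaseConfiguration.create()

// Option 1: never try to replace a failed DataNode; keep writing to the
// remaining nodes in the pipeline. Reasonable on very small clusters, but
// blocks may stay under-replicated until they are closed.
conf.set("dfs.client.block.write.replace-datanode-on-failure.policy", "NEVER")

// Option 2 (newer Hadoop client releases): keep the DEFAULT policy, but
// continue with the surviving DataNodes when no replacement can be found,
// instead of failing the write.
// conf.set("dfs.client.block.write.replace-datanode-on-failure.policy", "DEFAULT")
// conf.setBoolean("dfs.client.block.write.replace-datanode-on-failure.best-effort", true)

Note the trade-off: NEVER (or best-effort) trades replica count during the write for write availability. On a cluster with plenty of healthy DataNodes the DEFAULT behavior is preferable, and repeated pipeline failures there are better investigated as a DataNode health or network problem rather than papered over in the client configuration.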