No live nodes contain current block. Will get new block locations from namenode and retry...

This post records exceptions hit while using HDFS and HBase, including failures to connect to a DataNode and a mis-sized DataXceiver limit, along with the fixes: raising the DataXceiver setting and handling file reads correctly in client code.


1   With multiple users operating on HDFS and HBase, the following exception appeared. In short, the client could not connect to a DataNode and therefore could not obtain the block:

INFO hdfs.DFSClient: Could not obtain block blk_-3181406624357578636_19200 from any node: java.io.IOException: No live nodes contain current block. Will get new block locations from namenode and retry...
13/07/23 09:06:39 WARN hdfs.DFSClient: Failed to connect to /192.168.3.4:50010, add to deadNodes and continuejava.net.SocketException:

2    Running hadoop fsck / against HDFS reported the filesystem as healthy, so the block data itself was fine and the NameNode and DataNodes were presumably consistent.

3    Checked the DataNode log.

It turned out to be a DataXceiver problem: the number of DataXceiver threads had exceeded 4096, so the DataNode could no longer serve reads or writes. The limit had already been raised to 4096 once, but that turned out to still be too small.

Raise it in the configuration file:

<property>  
        <name>dfs.datanode.max.xcievers</name>  
        <value>12288</value>  
</property>
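For reference, dfs.datanode.max.xcievers is the historical (misspelled) key; in Hadoop 2.x and later it is deprecated in favor of dfs.datanode.max.transfer.threads (default 4096), with the old key still accepted through Hadoop's configuration-deprecation mapping. On a newer cluster the equivalent hdfs-site.xml entry would be:

```xml
<property>
        <name>dfs.datanode.max.transfer.threads</name>
        <value>12288</value>
</property>
```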

4    The problem persisted. The DataNode log then reported the following error:

2012-06-18 17:47:13 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode DatanodeRegistration(x.x.x.x:50010, storageID=DS-196671195-10.10.120.67-50010-1334328338972, infoPort=50075, ipcPort=50020):DataXceiver
java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected 

https://issues.apache.org/jira/browse/HDFS-3555   According to this issue, the problem is on the client side, which leaves the DataNode unable to write data back to the client. Re-checking our code (there were many files involved) showed that input streams were never closed after each read.
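The fix boils down to always closing streams. A minimal, self-contained sketch of the pattern using plain java.io; the same try-with-resources idiom applies to the FSDataInputStream returned by Hadoop's FileSystem.open(), which also implements Closeable:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class SafeRead {
    // Read the first line of a file, guaranteeing the stream is closed.
    // With HDFS the shape is identical: open the stream in the
    // try-with-resources header and it is closed even on exceptions.
    static String readFirstLine(Path file) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(file)) {
            return reader.readLine();
        } // reader.close() runs here, even if readLine() throws
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("demo", ".txt");
        Files.write(tmp, "hello\n".getBytes());
        System.out.println(readFirstLine(tmp)); // prints "hello"
        Files.delete(tmp);
    }
}
```

Leaked streams keep a socket to the DataNode open on the server side, so each one ties up a DataXceiver thread until it times out, which is why raising the limit only delayed the failure.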

5  After fixing that and restarting the cluster, HBase failed to load the META table on startup:

org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region is not online: .META.,,1

http://www.zihou.me/html/2013/06/27/8673.html   Following this post resolved it; after another restart HBase was fine.

6   Later, when reading data from multiple threads, some regions still stopped serving after a while. That again pointed at the DataNodes, but the DataXceiver limit was already very large. Further code review showed that every read obtained a new FileSystem instance and never closed it; after changing this, the problem was resolved.
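One way to avoid creating a client per read is to share a single instance across threads. The sketch below uses a placeholder class rather than Hadoop's real FileSystem; note that FileSystem.get(conf) itself caches one instance per scheme/authority, so the common bug is either obtaining and never closing per-read handles, or calling close() on the shared cached instance and breaking every other user of it:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SharedClient {
    // Stand-in for an expensive, thread-safe client (e.g. a FileSystem).
    static final AtomicInteger created = new AtomicInteger();

    private static volatile SharedClient instance;

    private SharedClient() { created.incrementAndGet(); }

    // Lazily create one shared instance; double-checked locking with a
    // volatile field makes this safe under concurrent first access.
    static SharedClient get() {
        if (instance == null) {
            synchronized (SharedClient.class) {
                if (instance == null) instance = new SharedClient();
            }
        }
        return instance;
    }

    public static void main(String[] args) {
        SharedClient a = get();
        SharedClient b = get();
        System.out.println(a == b);        // true: one shared instance
        System.out.println(created.get()); // 1: constructed only once
    }
}
```

Every reader thread calls get() instead of constructing its own client, and nobody closes the shared instance until shutdown.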


Summary: in the end this problem had little to do with the cluster itself; leaving DataXceiver at 4096 would probably have been fine.

We had two bugs of our own:

First: files were never closed after reading.

Second: don't obtain a new FileSystem instance for every read.

A reader comment reports a new error after uploading a file:

[root@7227030104gjx1 /]# hdfs dfs -put /words2.txt /jqe/wc/data
25/06/22 12:42:40 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /jqe/wc/data/words2.txt._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1625)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3127)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3051)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:725)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:493)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2217)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2213)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2213)
        at org.apache.hadoop.ipc.Client.call(Client.java:1476)
        at org.apache.hadoop.ipc.Client.call(Client.java:1413)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
        at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1588)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1373)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:554)
put: File /jqe/wc/data/words2.txt._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
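That error is different from the ones above: "There are 0 datanode(s) running" means the NameNode sees no live DataNodes at all, so no write can succeed. A couple of standard diagnostic commands, assuming a normal Hadoop installation on the PATH:

```shell
# Ask the NameNode how many DataNodes it currently considers live
hdfs dfsadmin -report

# On each DataNode host: is the DataNode JVM actually running?
jps | grep DataNode
```

If the DataNode processes are up but not registering, check the DataNode log; a common cause after reformatting the NameNode is a clusterID mismatch between the NameNode's and the DataNodes' VERSION files.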