Hadoop cluster formatted multiple times: configured capacity reported as 0 and data cannot be imported

Contents

Problem encountered

Fixing the problem

Verifying the fix

Checking cluster status


Problem encountered

Uploading a file with the hadoop fs -put command failed with the following error:

03/01/19 15:18:03 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /home/input/file1.txt._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1).  There are 2 datanode(s) running and no node(s) are excluded in this operation.
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)
        at org.apache.hadoop.ipc.Client.call(Client.java:1347)
        at org.apache.hadoop.ipc.Client.call(Client.java:1300)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
        at $Proxy9.addBlock(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at $Proxy9.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:330)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1226)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1078)
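
From the file path in the error (/home/input/file1.txt._COPYING_), the failing upload was of this form (reconstructed from the log, not necessarily the exact original invocation):

    hadoop fs -put file1.txt /home/input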

The root cause of the failure above is that no valid storage capacity has been allocated to HDFS. This is the classic result of formatting the NameNode more than once: each format generates a new clusterID, while the DataNodes still hold the old clusterID in their data directories, so they can no longer register with the NameNode. Running hadoop dfsadmin -report confirms the symptom:

[root@master sbin]# hadoop dfsadmin -report
WARNING: Use of this script to execute dfsadmin is deprecated.
WARNING: Attempting to execute replacement "hdfs dfsadmin" instead.

Configured Capacity: 0 (0 B)
Present Capacity: 0 (0 B)
DFS Remaining: 0 (0 B)
DFS Used: 0 (0 B)
DFS Used%: 0.00%
Replicated Blocks:
	Under replicated blocks: 0
	Blocks with corrupt replicas: 0
	Missing blocks: 0
	Missing blocks (with replication factor 1): 0
	Pending deletion blocks: 0
Erasure Coded Block Groups: 
	Low redundancy block groups: 0
	Block groups with corrupt internal blocks: 0
	Missing block groups: 0
	Pending deletion blocks: 0
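
A quick way to confirm this diagnosis is to compare the clusterID recorded by the NameNode with the one held by each DataNode. The paths below follow this article's directory layout (NameNode metadata under /root/hadoop/dfs/name, DataNode blocks under /root/hadoop/dfs/data); adjust them if your dfs.namenode.name.dir / dfs.datanode.data.dir settings differ:

    # On the master (NameNode):
    grep clusterID /root/hadoop/dfs/name/current/VERSION

    # On each slave (DataNode):
    grep clusterID /root/hadoop/dfs/data/current/VERSION

    # If the two clusterID values differ, the DataNodes cannot register,
    # which shows up as the "Configured Capacity: 0" report above.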

Fixing the problem

  1. Go to the sbin directory and stop the cluster:
    ./stop-all.sh

  2. Clear the data directories on every node of the cluster (every node must be cleaned, otherwise capacity still will not be allocated; note that this deletes any data already stored in HDFS):
    rm -rf /root/hadoop/tmp/*
    rm -rf /root/hadoop/var/*
    rm -rf /root/hadoop/dfs/name/*
    rm -rf /root/hadoop/dfs/data/*

  3. Go to the bin directory and reformat the NameNode:
    ./hadoop namenode -format

  4. Restart the cluster:
    ./start-all.sh
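
For convenience, the same four steps as a single sequence. This is a sketch assuming the layout used in this article (sbin/ and bin/ under the Hadoop installation directory, data directories under /root/hadoop); the rm step must be run on every node:

    # On the master, from the Hadoop installation directory:
    sbin/stop-all.sh

    # On EVERY node (master and all slaves) -- stale metadata on any
    # node keeps its DataNode from registering with the new NameNode:
    rm -rf /root/hadoop/tmp/* /root/hadoop/var/* \
           /root/hadoop/dfs/name/* /root/hadoop/dfs/data/*

    # Back on the master: reformat the NameNode and restart:
    bin/hadoop namenode -format
    sbin/start-all.sh

If the cluster holds data that must be preserved, an alternative is to skip the wipe and instead copy the clusterID from the NameNode's VERSION file into each DataNode's VERSION file (see the diagnostic check above), then restart the cluster; wiping and reformatting is only appropriate when the existing HDFS data is disposable.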

Verifying the fix

[root@master sbin]# hadoop dfsadmin -report
WARNING: Use of this script to execute dfsadmin is deprecated.
WARNING: Attempting to execute replacement "hdfs dfsadmin" instead.

Configured Capacity: 79401328640 (73.95 GB)
Present Capacity: 68053057536 (63.38 GB)
DFS Remaining: 68053049344 (63.38 GB)
DFS Used: 8192 (8 KB)
DFS Used%: 0.00%
Replicated Blocks:
	Under replicated blocks: 0
	Blocks with corrupt replicas: 0
	Missing blocks: 0
	Missing blocks (with replication factor 1): 0
	Pending deletion blocks: 0
Erasure Coded Block Groups: 
	Low redundancy block groups: 0
	Block groups with corrupt internal blocks: 0
	Missing block groups: 0
	Pending deletion blocks: 0

-------------------------------------------------
Live datanodes (2):

Name: 192.168.1.11:9866 (slave1)
Hostname: slave1
Decommission Status : Normal
Configured Capacity: 39700664320 (36.97 GB)
DFS Used: 4096 (4 KB)
Non DFS Used: 5674192896 (5.28 GB)
DFS Remaining: 34026467328 (31.69 GB)
DFS Used%: 0.00%
DFS Remaining%: 85.71%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Jan 03 02:14:20 EST 2019
Last Block Report: Thu Jan 03 02:13:55 EST 2019
Num of Blocks: 0


Name: 192.168.1.12:9866 (slave2)
Hostname: slave2
Decommission Status : Normal
Configured Capacity: 39700664320 (36.97 GB)
DFS Used: 4096 (4 KB)
Non DFS Used: 5674078208 (5.28 GB)
DFS Remaining: 34026582016 (31.69 GB)
DFS Used%: 0.00%
DFS Remaining%: 85.71%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Jan 03 02:14:20 EST 2019
Last Block Report: Thu Jan 03 02:13:55 EST 2019
Num of Blocks: 0


[root@master sbin]# 

As shown, capacity has now been allocated.
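
As a final check, the upload that failed at the start can be retried (file name taken from the original error log; a quick re-test sketch):

    hadoop fs -put file1.txt /home/input
    hadoop fs -ls /home/input    # file1.txt should now be listed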

Checking cluster status

YARN ResourceManager UI (cluster nodes):

http://192.168.1.10:8088/cluster/nodes

NameNode web UI (HDFS overview):

http://192.168.1.10:50070/dfshealth.html#tab-overview

Hadoop cluster installation example: https://blog.youkuaiyun.com/boonya/article/details/80719245
