Hadoop file-read API error

Reading a file from a Hadoop cluster failed with a BlockMissingException; the problem was resolved by adjusting the code and the environment configuration.

Exception in thread "main" org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1124468226-10.0.2.15-1429879726015:blk_1073742186_1370 file=/user/testdir/yarn-site.xml
    at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:889)
    at org.apache.hadoop.hdfs.DFSInputStream.fetchBlockByteRange(DFSInputStream.java:998)
    at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1333)
    at org.apache.hadoop.fs.FSInputStream.readFully(FSInputStream.java:78)
    at org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:116)
    at com.hdfs.file.ReadFile.main(ReadFile.java:24)
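The block pool ID in the exception message carries a useful hint: HDFS builds it as BP-<random>-<NameNode IP>-<creation time in ms>, so it records the IP the NameNode had when the cluster was formatted. The sketch below pulls that IP out of the message and checks whether it falls in a private range. The class and method names (`BlockPoolIdInfo`, `embeddedIp`, `isPrivate`) are hypothetical helpers made up for illustration, not part of Hadoop, and a private IP is only a hint that the client may not be able to reach the cluster's internal addresses, not proof of the cause.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BlockPoolIdInfo {
    // Matches BP-<random>-<IPv4 address>-<creation time in ms>.
    private static final Pattern BP_ID =
            Pattern.compile("BP-\\d+-(\\d{1,3}(?:\\.\\d{1,3}){3})-(\\d+)");

    /** Returns the IPv4 address embedded in the first block pool ID found, or null. */
    public static String embeddedIp(String message) {
        Matcher m = BP_ID.matcher(message);
        return m.find() ? m.group(1) : null;
    }

    /** True if the address lies in 10/8, 172.16/12 or 192.168/16. */
    public static boolean isPrivate(String ip) {
        try {
            // A literal IP is parsed directly, so no DNS lookup happens here.
            return InetAddress.getByName(ip).isSiteLocalAddress();
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String msg = "Could not obtain block: "
                + "BP-1124468226-10.0.2.15-1429879726015:blk_1073742186_1370 "
                + "file=/user/testdir/yarn-site.xml";
        String ip = embeddedIp(msg);
        System.out.println(ip + " private=" + isPrivate(ip)); // prints "10.0.2.15 private=true"
    }
}
```

Here the embedded address is 10.0.2.15, a private address, which suggests the cluster lives on an internal network (for example a virtual machine) whose addresses a Windows client may not be able to reach directly.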

The code below, run on a Windows machine, produces the error above:

package com.hdfs.file;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadFile {

    /**
     * Reads an HDFS file fully into memory and prints its contents.
     * @throws Exception if the file does not exist or the read fails
     */
    public static void main(String[] args) throws Exception {
        // Load the default configuration (including core-site.xml from the classpath, if present)
        Configuration conf=new Configuration();
        FileSystem fs=FileSystem.get(conf);
        Path path=new Path("hdfs://sandbox.hortonworks.com:8020/user/testdir/yarn-site.xml");
        if(fs.exists(path)){
            FSDataInputStream fsIn=fs.open(path);
            FileStatus status=fs.getFileStatus(path);
            byte[] buffer=new byte[Math.toIntExact(status.getLen())]; // throws ArithmeticException if the length exceeds Integer.MAX_VALUE
            fsIn.readFully(0,buffer);
            fsIn.close();
            fs.close();
            System.out.println("Read complete!");
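            System.out.println(new String(buffer, java.nio.charset.StandardCharsets.UTF_8)); // assumes the file is UTF-8; avoids the Windows default charset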
        }else{
            throw new Exception("the file is not found!");
        }
    }

}
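In the code above, `fs.exists` and `fs.getFileStatus` only talk to the NameNode, while `readFully` has to fetch block data from a DataNode. This is why the program gets as far as line 24 before failing. A quick way to narrow this down is to check from the client machine whether both services are reachable over TCP. The sketch below is a minimal check under stated assumptions: `PortCheck` and `canConnect` are hypothetical names, the host and port 8020 come from the path in the code above, and 50010 is assumed because it is the default DataNode data-transfer port in Hadoop 2.x (your cluster may use a different one).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    public static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers unresolvable hosts, refused connections and timeouts.
            return false;
        }
    }

    public static void main(String[] args) {
        String host = "sandbox.hortonworks.com"; // host from the HDFS path above
        // NameNode RPC port, taken from the hdfs:// URI above.
        System.out.println("NameNode 8020: " + canConnect(host, 8020, 3000));
        // Assumed default DataNode transfer port for Hadoop 2.x.
        System.out.println("DataNode 50010: " + canConnect(host, 50010, 3000));
    }
}
```

If the NameNode port is reachable but the DataNode port is not, the client most likely cannot reach the address the NameNode hands out for the DataNode. The original post does not say exactly which change fixed it. One commonly used client-side setting in this situation is `conf.set("dfs.client.use.datanode.hostname", "true")`, which makes the client connect to DataNodes by hostname instead of the reported IP, combined with a hosts-file entry that maps the hostname to an address reachable from the client.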