Downloading a file from the HDFS cluster to the local machine:
import java.io.FileOutputStream;
import java.io.IOException;
import org.apache.commons.io.IOUtils;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
public class HdfsUtil {
    public static void main(String[] args) throws IOException {
        downloadFromHdfs();
    }

    // Download a file from HDFS to the local file system.
    public static void downloadFromHdfs() throws IOException {
        // Loads the configuration files found on the classpath.
        Configuration conf = new Configuration();
        // conf.set("fs.defaultFS", "hdfs://sempplsl-01:9000"); // without this setting, fs.open() below fails
        // fs acts as the HDFS client used for reading and writing.
        FileSystem fs = FileSystem.get(conf);
        // The file to download from HDFS.
        Path path = new Path("hdfs://sempplsl-01:9000/hadoop-2.4.1.tar.gz");
        // Open an input stream on the HDFS file and copy it to a local file;
        // try-with-resources closes both streams when the copy finishes.
        try (FSDataInputStream input = fs.open(path);
             FileOutputStream fo = new FileOutputStream("/home/hadoop/Downloads/hadoop.tgz")) {
            IOUtils.copy(input, fo);
        }
    }
}
Running it fails with the following error:
2016-09-03 03:14:28,743 WARN [main] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: hdfs://sempplsl-01:9000/hadoop-2.4.1.tar.gz, expected: file:///
    at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:643)
    at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:79)
    at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:506)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:724)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:501)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:397)
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:137)
    at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:339)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:764)
    at sempp.lsl.hadoop.hdfs.HdfsUtil.downloadFromHdfs(HdfsUtil.java:23)
    at sempp.lsl.hadoop.hdfs.HdfsUtil.main(HdfsUtil.java:13)
The error says "Wrong FS", yet the file really does exist on the cluster, so why can't it be found? The reason is that FileSystem never obtained an HDFS file system instance. With fs.defaultFS unset in conf, FileSystem.get(conf) falls back to the local file system (file:///), as the RawLocalFileSystem frames in the stack trace show, and that file system does not recognize "hdfs://sempplsl-01:9000". The fix is to add the fs.defaultFS value configured in core-site.xml to the conf object in the code. With that setting added, the program runs successfully.
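To make the "Wrong FS" message more concrete, here is a simplified, standard-library-only sketch of the kind of scheme and authority comparison that leads to it. This is not Hadoop's actual checkPath code; the class name WrongFsDemo and the helper sameFileSystem are hypothetical names introduced only for illustration, and the real check handles more cases (default ports, canonical host names, and so on).

```java
import java.net.URI;

public class WrongFsDemo {
    // Hypothetical helper: returns true when `path` can be served by the
    // file system identified by `fsUri`. Simplified on purpose.
    static boolean sameFileSystem(URI fsUri, URI path) {
        String scheme = path.getScheme();
        if (scheme == null) {
            // A path without a scheme is resolved against the current file system.
            return true;
        }
        if (!scheme.equalsIgnoreCase(fsUri.getScheme())) {
            // e.g. an hdfs:// path handed to a file:/// file system
            return false;
        }
        String pathAuthority = path.getAuthority();
        if (pathAuthority == null) {
            // Same scheme and no host given: accept.
            return true;
        }
        return pathAuthority.equalsIgnoreCase(fsUri.getAuthority());
    }

    public static void main(String[] args) {
        URI path = URI.create("hdfs://sempplsl-01:9000/hadoop-2.4.1.tar.gz");

        // fs.defaultFS not set: the default file system is the local one.
        URI localFs = URI.create("file:///");
        System.out.println("local fs accepts path: " + sameFileSystem(localFs, path));

        // fs.defaultFS set to the value from core-site.xml.
        URI hdfs = URI.create("hdfs://sempplsl-01:9000");
        System.out.println("hdfs accepts path:     " + sameFileSystem(hdfs, path));
    }
}
```

In the first case the schemes differ (hdfs versus file), which corresponds to the "expected: file:///" part of the exception. In the second case both scheme and authority match, which is the situation after setting fs.defaultFS in conf.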



