Ways to write & read HDFS files

This post covers three ways to work with HDFS files in Hadoop: reading and writing raw bytes with FSDataOutputStream, handling text with BufferedWriter/BufferedReader, and using SequenceFile for efficient serialized key/value I/O. Each approach suits a different scenario.



- Output Stream

 
//Writer: write a binary int directly to an HDFS file
FSDataOutputStream dos = fs.create(new Path("/user/tmp"), true); // true = overwrite if the file exists
dos.writeInt(counter);
dos.close();
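
To read the value back, a minimal sketch (assuming the same "/user/tmp" path and FileSystem handle fs as above; FSDataInputStream extends DataInputStream, so readInt() is available):

//Reader: read the binary int back
FSDataInputStream dis = fs.open(new Path("/user/tmp"));
int counter = dis.readInt();
dis.close();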

- Buffered Writer/Reader
//Writer
BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(fs.create(new Path("/user/tmp"), true)));
bw.write(String.valueOf(counter)); // write the counter as text
bw.close();

//Reader
Configuration conf = context.getConfiguration();
FileSystem fs = FileSystem.get(conf);

// fs.open() returns an FSDataInputStream, which can be wrapped directly
BufferedReader reader = new BufferedReader(new InputStreamReader(fs.open(new Path(inFile))));
String line;
while ((line = reader.readLine()) != null) {
...
}
reader.close();
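
Note that because the value was written as text, the reader must parse it back into a number, e.g. with Integer.parseInt(line).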
  

- SequenceFile Reader and Writer (in my opinion, the most preferable way for Hadoop jobs):
//Writer: append a (key, value) pair of Text objects
SequenceFile.Writer writer = SequenceFile.createWriter(fs, conf,
        new Path(pathForCounters, context.getTaskAttemptID().toString()),
        Text.class, Text.class);
writer.append(new Text(firstUrl.toString() + "__" + context.getTaskAttemptID().getTaskID().toString()),
        new Text(String.valueOf(counter)));
writer.close();

//Reader: iterate over all (key, value) pairs in the file
SequenceFile.Reader reader = new SequenceFile.Reader(fs,
        new Path(makeUUrlFileOffsetsPathName(FileInputFormat.getInputPaths(context)[0].toString())), conf);
Text key = new Text();
Text val = new Text();
while (reader.next(key, val)) {
    offsets.put(key.toString(), Integer.parseInt(val.toString()));
}
reader.close();
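
Note: on Hadoop 2.x and later, the (fs, conf, path) constructor and createWriter overload used above are deprecated in favor of the option-based API. A minimal equivalent sketch, assuming a Path variable path and the same Text key/value types:

//Writer (option-based API)
SequenceFile.Writer writer = SequenceFile.createWriter(conf,
        SequenceFile.Writer.file(path),
        SequenceFile.Writer.keyClass(Text.class),
        SequenceFile.Writer.valueClass(Text.class));

//Reader (option-based API)
SequenceFile.Reader reader = new SequenceFile.Reader(conf,
        SequenceFile.Reader.file(path));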
