Ways to write & read HDFS files
- Output Stream
FSDataOutputStream dos = fs.create(new Path("/user/tmp"), true);
dos.writeInt(counter);
dos.close();
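Because FSDataOutputStream extends java.io.DataOutputStream, writeInt stores the value as four big-endian bytes rather than as text, so it has to be read back with readInt and not with a line reader. A minimal sketch of that round trip, assuming an in-memory ByteArrayOutputStream as a stand-in for the stream returned by fs.create (the class and method names here are hypothetical, not part of the original notes):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper class; the in-memory buffer stands in for an HDFS file.
class IntRoundTrip {

    // Writes the counter the same way dos.writeInt(counter) does: 4 bytes, big-endian.
    public static byte[] writeCounter(int counter) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream(); // stand-in for fs.create(...)
        try (DataOutputStream dos = new DataOutputStream(buffer)) {
            dos.writeInt(counter);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads the counter back with the matching binary call, readInt.
    public static int readCounter(byte[] bytes) {
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(bytes))) { // stand-in for fs.open(...)
            return dis.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = writeCounter(42);
        System.out.println(bytes.length);       // prints 4
        System.out.println(readCounter(bytes)); // prints 42
    }
}
```

On a real cluster the same writeInt/readInt pair would be called on the streams from fs.create and fs.open.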
- Buffered Writer/Reader
//Writer
BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(fs.create(new Path("/user/tmp"), true)));
bw.write(String.valueOf(counter));
bw.close();
//Reader
Configuration conf = context.getConfiguration();
FileSystem fs = FileSystem.get(conf);
// fs.open already returns an FSDataInputStream, so no extra DataInputStream wrapper is needed
BufferedReader reader = new BufferedReader(new InputStreamReader(fs.open(new Path(inFile))));
String line;
while ((line = reader.readLine()) != null){
...
}
reader.close();
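The writer and reader above use the platform default charset and are not closed if an exception is thrown midway. A hedged sketch of the same text round trip with try-with-resources and an explicit UTF-8 charset, again assuming in-memory streams in place of fs.create and fs.open (TextRoundTrip and its methods are hypothetical names):

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; byte arrays stand in for HDFS files.
class TextRoundTrip {

    // Writes one value per line; try-with-resources closes (and flushes) the writer.
    public static byte[] writeLines(List<String> lines) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(); // stand-in for fs.create(...)
        try (BufferedWriter bw = new BufferedWriter(
                new OutputStreamWriter(out, StandardCharsets.UTF_8))) {
            for (String line : lines) {
                bw.write(line);
                bw.write('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Reads the lines back until readLine() returns null at end of stream.
    public static List<String> readLines(byte[] bytes) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(bytes), // stand-in for fs.open(...)
                        StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] bytes = writeLines(Arrays.asList("17", "42"));
        System.out.println(readLines(bytes)); // prints [17, 42]
    }
}
```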
- SequenceFile Reader and Writer (I think this is the preferable way for Hadoop jobs):
//writer
SequenceFile.Writer writer = SequenceFile.createWriter(fs, conf, new Path(pathForCounters, context.getTaskAttemptID().toString()), Text.class, Text.class);
writer.append(new Text(firtUrl.toString()+"__"+ context.getTaskAttemptID().getTaskID().toString()), new Text(counter+""));
writer.close();
//reader
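SequenceFile.Reader reader = new SequenceFile.Reader(fs, new Path(makeUUrlFileOffsetsPathName(FileInputFormat.getInputPaths(context)[0].toString())), conf);
// reusable holders; the types match the Text/Text pairs appended by the writer
Text key = new Text();
Text val = new Text();
while (reader.next(key, val)){
offsets.put(key.toString(), Integer.parseInt(val.toString()));
}
reader.close();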