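A MapFile is essentially a SequenceFile whose records are sorted by key. It consists of two parts, index and data: the data part holds the key/value records, and the index part records keys from the data so that a record can be located by its key. When a MapFile is opened, the index is loaded into memory first, so finding a record by key is much faster than scanning a plain SequenceFile from the beginning. The implementation is almost the same as the SequenceFile example in the previous post, except that the objects involved are MapFile.Writer and MapFile.Reader. The code is as follows: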
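import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.MapFile;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.util.ReflectionUtils;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;

/**
 * MapFile demo.
 * A MapFile is a SequenceFile sorted by key.
 * When a MapFile is accessed, its index is loaded into memory first and then used to
 * locate the position of a record quickly, which is much faster than a plain SequenceFile.
 * Steps for writing a MapFile:
 * ① Create a Configuration object
 * ② Obtain a FileSystem object
 * ③ Set the output path of the file
 * ④ Create a MapFile.Writer object
 */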
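public class MapFileDemo {
    private Configuration conf;
    private FileSystem fs;

    // Runs before each test: load the configuration and obtain the file system.
    @Before
    public void run() throws IOException {
        conf = new Configuration();
        fs = FileSystem.get(conf);
    }

    // Runs after each test: release the file system handle.
    @After
    public void close() throws IOException {
        fs.close();
    }
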
    /**
     * Tests writing a MapFile.
     */
    @Test
    public void writeMap() throws IOException {
        Path path = new Path("/0222/mymap.map");
        Text key = new Text();
        Text value = new Text();
        key.set("key01");
        value.set("it's my lay");
        // Both the key and the value type are Text; keys must be appended in sorted order.
        MapFile.Writer writer = new MapFile.Writer(conf, fs, path.toString(), Text.class, Text.class);
        try {
            writer.append(key, value);
        } finally {
            // Close the writer even if append fails.
            IOUtils.closeStream(writer);
        }
    }

    /**
     * Tests reading a MapFile: iterates over all records in key order.
     */
    @Test
    public void readMap() throws IOException {
        Path path = new Path("/0222/mymap.map");
        MapFile.Reader reader = new MapFile.Reader(fs, path.toString(), conf);
        try {
            // Create key and value holders of the types recorded in the file.
            WritableComparable<?> key = (WritableComparable<?>) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
            Writable value = (Writable) ReflectionUtils.newInstance(reader.getValueClass(), conf);
            while (reader.next(key, value)) {
                System.out.println("key:" + key);
                System.out.println("value:" + value);
            }
        } finally {
            // Always release the reader.
            IOUtils.closeStream(reader);
        }
    }
}
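
To make the role of the index more concrete, here is a minimal, self-contained sketch of the general idea: sorted data plus a small in-memory index that is searched first. It is not Hadoop's implementation and uses no Hadoop classes. The class name SparseIndexSketch, its methods, the interval parameter, and the lastScanned counter are all hypothetical names made up for illustration, and requiring strictly ascending keys is a simplification chosen for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy in-memory model of sorted data with a sparse key index (illustration only, not Hadoop code). */
class SparseIndexSketch {
    private final int interval;                                 // hypothetical: index every interval-th entry
    private final List<String> keys = new ArrayList<>();        // stands in for the sorted data part
    private final List<String> values = new ArrayList<>();
    private final List<String> indexKeys = new ArrayList<>();   // sampled keys kept in memory
    private final List<Integer> indexPos = new ArrayList<>();   // data position of each sampled key
    int lastScanned;                                            // data entries examined by the last get()

    SparseIndexSketch(int interval) {
        if (interval < 1) {
            throw new IllegalArgumentException("interval must be >= 1");
        }
        this.interval = interval;
    }

    /** Appends an entry; in this sketch keys must arrive in strictly ascending order. */
    void append(String key, String value) {
        if (!keys.isEmpty() && key.compareTo(keys.get(keys.size() - 1)) <= 0) {
            throw new IllegalArgumentException("key out of order: " + key);
        }
        if (keys.size() % interval == 0) {   // sample this key into the index
            indexKeys.add(key);
            indexPos.add(keys.size());
        }
        keys.add(key);
        values.add(value);
    }

    /** Returns the value stored for key, or null if the key is absent. */
    String get(String key) {
        lastScanned = 0;
        // Step 1: binary-search the in-memory index for the last sampled key <= key.
        int lo = 0, hi = indexKeys.size() - 1, slot = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (indexKeys.get(mid).compareTo(key) <= 0) {
                slot = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        if (slot < 0) {
            return null;                     // key sorts before every stored key
        }
        // Step 2: scan the data forward from the indexed position only.
        for (int i = indexPos.get(slot); i < keys.size(); i++) {
            lastScanned++;
            int cmp = keys.get(i).compareTo(key);
            if (cmp == 0) {
                return values.get(i);
            }
            if (cmp > 0) {
                break;                       // data is sorted, so the key cannot appear later
            }
        }
        return null;
    }
}
```

In this sketch, a lookup that finds its key examines at most interval data entries, however many entries were appended, whereas a lookup without the index would have to scan from the first entry. A larger interval means a smaller index in memory but a longer scan per lookup.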