OIV Source Code Analysis in Hadoop 2.6
First, let's look at how the official documentation describes OIV:
The Offline Image Viewer is a tool to dump the contents of hdfs fsimage files to a human-readable format and provide read-only WebHDFS API in order to allow offline analysis and examination of an Hadoop cluster’s namespace. The tool is able to process very large image files relatively quickly. The tool handles the layout formats that were included with Hadoop versions 2.4 and up. If you want to handle older layout formats, you can use the Offline Image Viewer of Hadoop 2.3 or oiv_legacy Command. If the tool is not able to process an image file, it will exit cleanly. The Offline Image Viewer does not require a Hadoop cluster to be running; it is entirely offline in its operation.
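So version 2.6.0 ships two implementations in order to stay compatible with the fsimage format of earlier Hadoop versions. Looking at the hdfs script, we can see that the two commands invoke different main classes: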
elif [ "$COMMAND" = "oiv" ] ; then
  CLASS=org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageViewerPB
elif [ "$COMMAND" = "oiv_legacy" ] ; then
  CLASS=org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageViewer
The oiv command invokes OfflineImageViewerPB, while oiv_legacy invokes OfflineImageViewer.
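Let's start with the processing logic of OfflineImageViewerPB, conclusion first. For processors that need neither interaction nor the INode directory tree, it is enough to call FSImageUtil.loadSummary(file) to obtain the 10 sections and then parse them. (The sections have to be parsed in a fixed order, because, for example, the information stored in String_Table is needed to restore the permission information inside each INode.) What exactly the 10 sections contain is described here. For something like the web service, which has to answer interactive requests, FSImageLoader.load(inputFile) is used instead, so that the directory structure and the details of every INode are kept in memory.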
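To make the ordering constraint concrete, here is a minimal, self-contained sketch. It is not Hadoop's code: the Section enum is a hypothetical subset of the real section names, and the packed permission layout (24-bit user string id, 24-bit group string id, 16-bit mode) is an assumption used only for illustration. It shows that an INode's permission field can only be turned back into names once the string table has been read.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SectionOrderSketch {
    // Hypothetical subset of section names; declaration order is the processing order.
    enum Section { NS_INFO, STRING_TABLE, INODE, INODE_DIR }

    // Assumed layout of the packed permission value (illustration only):
    // [user string id: 24 bits][group string id: 24 bits][mode: 16 bits]
    static final int MODE_BITS = 16;
    static final int STRID_BITS = 24;
    static final long STRID_MASK = (1L << STRID_BITS) - 1;

    // Packs string-table indexes and mode bits into one long.
    static long pack(int userId, int groupId, int mode) {
        return ((long) userId << (MODE_BITS + STRID_BITS))
            | ((long) groupId << MODE_BITS)
            | (mode & 0xFFFF);
    }

    // Resolves the packed value to "user:group:octalMode" using the string table.
    static String decode(long packed, String[] stringTable) {
        int mode = (int) (packed & ((1L << MODE_BITS) - 1));
        int groupId = (int) ((packed >> MODE_BITS) & STRID_MASK);
        int userId = (int) ((packed >> (MODE_BITS + STRID_BITS)) & STRID_MASK);
        return stringTable[userId] + ":" + stringTable[groupId] + ":"
            + Integer.toOctalString(mode);
    }

    // Sorts sections by enum ordinal so STRING_TABLE is handled before INODE.
    static List<Section> processingOrder(List<Section> onDisk) {
        List<Section> sorted = new ArrayList<>(onDisk);
        sorted.sort(Comparator.comparingInt(Section::ordinal));
        return sorted;
    }

    public static void main(String[] args) {
        List<Section> order = processingOrder(
            Arrays.asList(Section.INODE, Section.STRING_TABLE, Section.NS_INFO));
        System.out.println(order);

        // The string table must already be loaded when the INode is decoded.
        String[] stringTable = {"", "hdfs", "supergroup"};
        long packed = pack(1, 2, 0755);
        System.out.println(decode(packed, stringTable));
    }
}
```

If the INODE section were decoded first, the indexes 1 and 2 would have nothing to resolve against, which is why the section list is sorted before it is walked.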
Below I pick two of the processors, XML and Web, and analyze each in detail.
OfflineImageViewerPB
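This is the main class behind OIV. It parses the command-line arguments and calls the matching logic based on them. Its most important method, run(), looks like this: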
public static int run(String[] args) throws Exception {
  ...
  String inputFile = cmd.getOptionValue("i");
  String processor = cmd.getOptionValue("p", "Web");
  String outputFile = cmd.getOptionValue("o", "-");
  ...
  Configuration conf = new Configuration();
  try {
    if (processor.equals("FileDistribution")) {
      long maxSize = Long.parseLong(cmd.getOptionValue("maxSize", "0"));
      int step = Integer.parseInt(cmd.getOptionValue("step", "0"));
      new FileDistributionCalculator(conf, maxSize, step, out)
          .visit(new RandomAccessFile(inputFile, "r"));
    } else if (processor.equals("XML")) {
      new PBImageXmlWriter(conf, out).visit(new RandomAccessFile(inputFile,
          "r"));
    } else if (processor.equals("ReverseXML")) {
      try {
        OfflineImageReconstructor.run(inputFile, outputFile);
      } catch (Exception e) {
        System.err.println("OfflineImageReconstructor failed: " +
            e.getMessage());
        e.printStackTrace(System.err);
        System.exit(1);
      }
    } else if (processor.equals("Web")) {
      String addr = cmd.getOptionValue("addr", "localhost:5978");
      WebImageViewer viewer = new WebImageViewer(NetUtils.createSocketAddr
          (addr));
      try {
        viewer.start(inputFile);
      } finally {
        viewer.close();
      }
    } else if (processor.equals("Delimited")) {
      try (PBImageDelimitedTextWriter writer =
          new PBImageDelimitedTextWriter(
              new PrintStream(new WriterOutputStream(out)), delimiter, tempPath)) {
        writer.visit(new RandomAccessFile(inputFile, "r"));
      }
    } else {
      System.err.println("Invalid processor specified : " + processor);
      printUsage();
      return -1;
    }
    return 0;
  } catch (EOFException e) {
    System.err.println("Input file ended unexpectedly. Exiting");
  } catch (IOException e) {
    System.err.println("Encountered exception. Exiting: " + e.getMessage());
  } finally {
    IOUtils.cleanup(null, out);
  }
  return -1;
}
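As the code shows, OIV currently supports five processors: FileDistribution, XML, ReverseXML, Web, and Delimited. The run() method itself is simple: it parses the command-line options and dispatches to the processor named by -p, falling back to Web when none is given.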
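The dispatch pattern in run() can be reduced to a few lines. The sketch below is a simplified stand-in, not the Hadoop implementation: the class name, the handler map, and the lastRun field are all hypothetical, and each handler only records that it was called. What it keeps from run() is the default of "Web" for -p and the 0 / -1 return convention.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

public class ProcessorDispatchSketch {
    // Hypothetical handlers keyed by processor name; each one only records the call.
    static final Map<String, Consumer<String>> PROCESSORS = new LinkedHashMap<>();
    static String lastRun = null;

    static {
        for (String name : new String[] {
                "FileDistribution", "XML", "ReverseXML", "Web", "Delimited"}) {
            PROCESSORS.put(name, input -> lastRun = name + "(" + input + ")");
        }
    }

    // Mirrors the shape of run(): read options, pick a processor, return 0 or -1.
    static int run(Map<String, String> options) {
        String inputFile = options.get("i");
        String processor = options.getOrDefault("p", "Web"); // same default as run()
        Consumer<String> handler = PROCESSORS.get(processor);
        if (handler == null) {
            System.err.println("Invalid processor specified : " + processor);
            return -1;
        }
        handler.accept(inputFile);
        return 0;
    }

    public static void main(String[] args) {
        Map<String, String> options = new HashMap<>();
        options.put("i", "fsimage_0001"); // hypothetical input file name
        System.out.println(run(options) + " " + lastRun);

        options.put("p", "XML");
        System.out.println(run(options) + " " + lastRun);
    }
}
```

A lookup table like this is one alternative to the if/else chain; the original keeps the chain, presumably because each branch needs different extra options (maxSize and step, addr, delimiter, and so on).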
XML
Converting the contents of an FSImage file to XML is handled by the PBImageXmlWriter class. Its most important method is visit(), whose source is shown below:
public void visit(RandomAccessFile file) throws IOException {
  if (!FSImageUtil.checkFileFormat(file)) {
    throw new IOException("Unrecognized FSImage");
  }
  FileSummary summary = FSImageUtil.loadSummary(file);
  FileInputStream fin = null;
  try {
    fin = new FileInputStream(file.getFD());
    out.print("<?xml version=\"1.0\"?>\n<fsimage>");
    out.print("<version>");
    o("layoutVersion", summary.getLayoutVersion());
    o("onDiskVersion", summary.getOndiskVersion());
    // Output the version of OIV (which is not necessarily the version of
    // the fsimage file). This could be helpful in the case where a bug
    // in OIV leads to information loss in the XML-- we can quickly tell
    // if a specific fsimage XML file is affected by this bug.
    o("oivRevision", VersionInfo.getRevision());
    out.print("</version>\n");
    ArrayList<FileSummary.Section> sections = Lists.newArrayList(summary
        .getSectionsList());
    Collections.sort(sections, new Comparator<FileSummary.Section>() {
      @Override
      public int compare(FileSummary.Section s1, FileSummary.Section s2) {
        SectionName n1 = SectionName.fromString(s1.getName());
        SectionName n2 = SectionName.fromString(s2.getName());
        if (n1 == null) {
          return n2 == null ? 0 : -1;
        } else if (n2 == null) {
          return -1;
        } else {
          return n1.ordinal() - n2.ordinal();
        }
      }
    });
    for (FileSummary.Section s : sections) {
      fin.getChannel().position(s.getOffset());
      InputStream is = FSImageUtil.wrapInputStreamForCompression(conf,
          summary.getCodec(), new BufferedInputStream(new LimitInputStream(
              fin, s.getLength())));
      switch (SectionName.fromString(s.getName())) {