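Using the Quickstart example again, this section walks through how the forward index is created.
Collect statistics for each column
The code is the same as for the dictionary index.
Traverse the records again and build each column's index row by row
Reset the iterator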
// Build the index
recordReader.rewind();
Iterate over the records again and index each row
LOGGER.info("Start building IndexCreator!");
while (recordReader.hasNext()) {
  long start = System.currentTimeMillis();
  GenericRow row = recordReader.next();
  long stop = System.currentTimeMillis();
  indexCreator.indexRow(row);
  long stop1 = System.currentTimeMillis();
  totalRecordReadTime += (stop - start);
  totalIndexTime += (stop1 - stop);
}
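The two passes described above (collect per-column statistics first, then rewind and index every row) can be illustrated with a small standalone sketch. This is not Pinot's code: the class `ForwardIndexSketch` and its methods are hypothetical, it handles a single string column held in memory, and it assumes dictionary IDs are assigned in sorted value order.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Simplified illustration only; these are hypothetical names, not Pinot classes.
class ForwardIndexSketch {

    // Pass 1: collect the distinct values of the column (the "statistics"),
    // then assign each distinct value a dictionary ID in sorted order (an assumption).
    static Map<String, Integer> buildDictionary(List<String> columnValues) {
        TreeSet<String> sorted = new TreeSet<>(columnValues);
        Map<String, Integer> dictionary = new HashMap<>();
        int nextId = 0;
        for (String value : sorted) {
            dictionary.put(value, nextId++);
        }
        return dictionary;
    }

    // Pass 2: walk the rows again from the start and store, for each row,
    // the dictionary ID of its value. The resulting array is the forward index.
    static int[] buildForwardIndex(List<String> columnValues) {
        Map<String, Integer> dictionary = buildDictionary(columnValues);
        int[] forwardIndex = new int[columnValues.size()];
        for (int docId = 0; docId < columnValues.size(); docId++) {
            forwardIndex[docId] = dictionary.get(columnValues.get(docId));
        }
        return forwardIndex;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("b", "a", "b", "c");
        // Dictionary: a=0, b=1, c=2
        System.out.println(Arrays.toString(buildForwardIndex(rows))); // prints [1, 0, 1, 2]
    }
}
```

The second pass only performs map lookups because the first pass has already seen every value, which is why the reader has to be rewound before indexing starts.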
The implementation of indexRow:
@Override
public void indexRow(GenericRow row) {
  for (final String column : dictionaryCreatorMap.keySet()) {
    Object columnValueToIndex = row.getValue(column);
    Object dictionaryIndex;
    if (dictionaryCache.get(column).containsKey(columnValueToIndex)) {
      dictionaryIndex = dictionaryCache.get(column).get(columnValueToIndex