Avro is a binary storage format that is being used more and more widely. While using it I ran into the following problem: I wanted to keep writing to the same file, but after every run the file contained only the records from the most recent write. The code was as follows:

public void writeToLocalFile(String namespace, Map<Integer, ItemCollection> itemCollectionMap) throws IOException {
    String filename = getLocalFileName(namespace + ".pre");
    File file = new File(LOCAL_FILE_DIR + filename);
    DatumWriter<MonitorItemCollection> userDatumWriter =
            new SpecificDatumWriter<MonitorItemCollection>(MonitorItemCollection.class);
    DataFileWriter<MonitorItemCollection> dataFileWriter =
            new DataFileWriter<MonitorItemCollection>(userDatumWriter);
    // create(schema, file) starts a new file on every call
    dataFileWriter.create((new MonitorItemCollection()).getSchema(), file);
    List<Integer> timestampList = new ArrayList<Integer>(itemCollectionMap.keySet());
    Collections.sort(timestampList);
    try {
        for (Integer timestamp : timestampList) {
            dataFileWriter.append(convertToMonitorItemType(itemCollectionMap.get(timestamp)));
        }
    } catch (Exception e) {
        logger.error("write local file [" + filename + "] error", e);
    } finally {
        dataFileWriter.close();
    }
}

Checking the official API, I found that the create method creates a new file every time it is called, so after my code finished, the file contained only the content of the latest write.

I then noticed that create has two overloads: one takes a File (the one used above) and the other takes an OutputStream. Since an OutputStream can be opened in a chosen mode, I wondered whether opening it in append mode would be enough, so I changed the code as follows:

public void writeToLocalFile(String namespace, Map<Integer, ItemCollection> itemCollectionMap) throws IOException {
    String filename = getLocalFileName(namespace + ".pre");
    File file = new File(LOCAL_FILE_DIR + filename);
    OutputStream os = new FileOutputStream(file, true); // open in append mode
    DatumWriter<MonitorItemCollection> userDatumWriter =
            new SpecificDatumWriter<MonitorItemCollection>(MonitorItemCollection.class);
    DataFileWriter<MonitorItemCollection> dataFileWriter =
            new DataFileWriter<MonitorItemCollection>(userDatumWriter);
    dataFileWriter.create((new MonitorItemCollection()).getSchema(), os);
    List<Integer> timestampList = new ArrayList<Integer>(itemCollectionMap.keySet());
    Collections.sort(timestampList);
    try {
        for (Integer timestamp : timestampList) {
            dataFileWriter.append(convertToMonitorItemType(itemCollectionMap.get(timestamp)));
        }
    } catch (Exception e) {
        logger.error("write local file [" + filename + "] error", e);
    } finally {
        dataFileWriter.close();
    }
}

The file did grow this time and was no longer recreated on every call, but it could no longer be parsed: reading it failed with a sync error. Checking the official API again, I found the appendTo method. The final code is as follows:

public void writeToLocalFile(String namespace, Map<Integer, ItemCollection> itemCollectionMap) throws IOException {
    String filename = getLocalFileName(namespace + ".pre");
    File file = new File(LOCAL_FILE_DIR + filename);
    DatumWriter<MonitorItemCollection> userDatumWriter =
            new SpecificDatumWriter<MonitorItemCollection>(MonitorItemCollection.class);
    DataFileWriter<MonitorItemCollection> dataFileWriter =
            new DataFileWriter<MonitorItemCollection>(userDatumWriter);
    List<Integer> timestampList = new ArrayList<Integer>(itemCollectionMap.keySet());
    Collections.sort(timestampList);
    // create() and appendTo() both return the writer they were called on,
    // so dfw refers to the same object as dataFileWriter.
    DataFileWriter<MonitorItemCollection> dfw;
    if (!file.exists()) {
        // First write: create the file and write its header.
        dfw = dataFileWriter.create((new MonitorItemCollection()).getSchema(), file);
    } else {
        // Later writes: reopen the existing file and append to it.
        dfw = dataFileWriter.appendTo(file);
    }
    try {
        for (Integer timestamp : timestampList) {
            dfw.append(convertToMonitorItemType(itemCollectionMap.get(timestamp)));
        }
    } catch (Exception e) {
        logger.error("write local file [" + filename + "] error", e);
    } finally {
        dfw.close();
    }
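}

This code works: it appends to the file, and the file can still be parsed correctly.

Summary: to write Avro data files, use the create and appendTo methods of DataFileWriter. create creates a new file, while appendTo opens an existing one. One thing to watch out for: if the file does not exist, calling appendTo directly throws an error.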
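
As for why the FileOutputStream append attempt produced a sync error: as far as I understand the Avro container format, every file starts with a header that holds a randomly generated 16-byte sync marker, and that marker is repeated after each data block. Calling create on an append-mode stream writes a second header with a different marker into the middle of the file, which the reader cannot interpret as a block. The sketch below illustrates this with a toy format of my own. It is not the Avro format and uses no Avro classes; ToySyncFile and all its methods are hypothetical names made up for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy format (NOT Avro): header = 16-byte marker; block = int length, payload, marker. */
class ToySyncFile {
    static final int SYNC_SIZE = 16;

    /** Plays the role of create(): always writes a fresh header with a new random marker. */
    static byte[] create(List<String> records) throws IOException {
        byte[] sync = new byte[SYNC_SIZE];
        new SecureRandom().nextBytes(sync);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(sync);
        writeBlocks(out, sync, records);
        return out.toByteArray();
    }

    /** Plays the role of appendTo(): reuses the existing marker and writes no second header. */
    static byte[] appendTo(byte[] existing, List<String> records) throws IOException {
        byte[] sync = Arrays.copyOf(existing, SYNC_SIZE);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(existing);
        writeBlocks(out, sync, records);
        return out.toByteArray();
    }

    private static void writeBlocks(ByteArrayOutputStream out, byte[] sync,
                                    List<String> records) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        for (String record : records) {
            byte[] payload = record.getBytes(StandardCharsets.UTF_8);
            data.writeInt(payload.length);
            data.write(payload);
            data.write(sync); // every block ends with the file's marker
        }
        data.flush();
    }

    /** Reads all records; fails if a block is malformed or ends with the wrong marker. */
    static List<String> read(byte[] file) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(file));
        byte[] sync = new byte[SYNC_SIZE];
        in.readFully(sync);
        List<String> records = new ArrayList<>();
        while (in.available() > 0) {
            int length = in.readInt();
            if (length < 0 || length > in.available() - SYNC_SIZE) {
                throw new IOException("corrupt block length: " + length);
            }
            byte[] payload = new byte[length];
            in.readFully(payload);
            byte[] marker = new byte[SYNC_SIZE];
            in.readFully(marker);
            if (!Arrays.equals(marker, sync)) {
                throw new IOException("invalid sync marker");
            }
            records.add(new String(payload, StandardCharsets.UTF_8));
        }
        return records;
    }

    /** Simulates FileOutputStream(file, true): raw byte concatenation. */
    static byte[] concat(byte[] a, byte[] b) {
        byte[] out = Arrays.copyOf(a, a.length + b.length);
        System.arraycopy(b, 0, out, a.length, b.length);
        return out;
    }

    public static void main(String[] args) throws IOException {
        byte[] first = create(Arrays.asList("a", "b"));

        // Correct append: same marker, single header.
        System.out.println("appendTo: " + read(appendTo(first, Arrays.asList("c"))));

        // Naive append: a second header lands where the reader expects a block.
        try {
            read(concat(first, create(Arrays.asList("c"))));
        } catch (IOException e) {
            System.out.println("naive append failed: " + e.getMessage());
        }
    }
}
```

In this toy version the appendTo path reads back all three records, while the concatenated file fails as soon as the reader reaches the second header. That matches what I observed with Avro: the file grows, but the reader gives up with a sync error.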
Avro: Appending Data to an Existing File