提示:文章写完后,目录可以自动生成,如何生成可参考右边的帮助文档
前言
使用spark读取excel往数据库进行导数(现有插件spark-excel_2.12只支持读取xlsx),又因为数据源中存在大量xls类型的文件,需要开发一个功能将xls文件转为xlsx,来进行导数
一、使用步骤
1.引入依赖
代码如下(示例):
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi</artifactId>
<version>4.1.2</version>
</dependency>
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi-ooxml-schemas</artifactId>
<version>4.1.2</version>
</dependency>
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi-ooxml</artifactId>
<version>4.1.2</version>
</dependency>
2.转换操作
代码如下(示例):
/**
* xls 文件转换为xlsx文件
*/
public static String xls2xlsx(File sourceFile) throws IOException {
//创建hssworkbook 操作xls 文件
POIFSFileSystem fs = new POIFSFileSystem(new File(sourceFile.getPath()));
HSSFWorkbook hssfWorkbook = new HSSFWorkbook(fs);
//创建xssfworkbook 操作xlsx 文件
XSSFWorkbook workbook = new XSSFWorkbook();
int sheetNum = hssfWorkbook.getNumberOfSheets();
String xlsxPath = createNewXlsxFilePath(sourceFile);
for (int sheetIndex = 0; sheetIndex < sheetNum; sheetIndex++) {
HSSFSheet hssfSheet = hssfWorkbook.getSheetAt(sheetIndex);
if (workbook.getSheet(hssfSheet.getSheetName()) == null) {
XSSFSheet xssfSheet = workbook.createSheet(hssfSheet.getSheetName());
copySheets(hssfSheet, xssfSheet);
} else {
copySheets(hssfSheet, workbook.createSheet(hssfSheet.getSheetName()));
}
FileOutputStream fileOut = new FileOutputStream(xlsxPath);
//将复制的xls数据写入到新的xlsx文件中
workbook.write(fileOut);
workbook.close();
hssfWorkbook.close();
//删除原有的xls文件 直接删除有点暴力 可以考虑在另外的目录下保存一下
sourceFile.delete();
}
return xlsxPath;
}
//为xlsx创建路径
public static String createNewXlsxFilePath(File sourceFile){
String oldPath = sourceFile.getPath();
String newPath = oldPath.substring(0,oldPath.indexOf("."))+".xlsx";
return newPath;
}
/**
* 转换为xlsx --创建sheet
* @param source
* @param destination
*/
public static void copySheets(HSSFSheet source, XSSFSheet destination) {
int maxColumnNum = 0;
for (int i = source.getFirstRowNum(); i <= source.getLastRowNum(); i++) {
HSSFRow srcRow = source.getRow(i);
XSSFRow destRow = destination.createRow(i);
if (srcRow != null) {
// 拷贝行
copyRow(srcRow, destRow);
if (srcRow.getLastCellNum() > maxColumnNum) {
maxColumnNum = srcRow.getLastCellNum();
}
}
}
for (int i = 0; i <= maxColumnNum; i++) {
destination.setColumnWidth(i, source.getColumnWidth(i));
}
}
/**
* 转换xlsx -- 复制行
* @param srcRow
* @param destRow
*/
public static void copyRow( HSSFRow srcRow, XSSFRow destRow) {
// 拷贝行高
destRow.setHeight(srcRow.getHeight());
for (int j = srcRow.getFirstCellNum(); j <= srcRow.getLastCellNum(); j++) {
HSSFCell oldCell = srcRow.getCell(j);
XSSFCell newCell = destRow.getCell(j);
if (oldCell != null) {
if (newCell == null) {
newCell = destRow.createCell(j);
}
// 拷贝单元格
copyCell(oldCell, newCell);
}
}
}
/**
* 转换xlsx -- 复制单元格
* @param oldCell
* @param newCell
*/
public static void copyCell(HSSFCell oldCell, XSSFCell newCell) {
switch (oldCell.getCellType()) {
case STRING:
newCell.setCellValue(oldCell.getStringCellValue());
break;
case NUMERIC:
newCell.setCellValue(oldCell.getNumericCellValue());
break;
case BLANK:
newCell.setCellType(CellType.BLANK);
break;
case BOOLEAN:
newCell.setCellValue(oldCell.getBooleanCellValue());
break;
case ERROR:
newCell.setCellErrorValue(oldCell.getErrorCellValue());
break;
case FORMULA:
newCell.setCellFormula(oldCell.getCellFormula());
break;
default:
break;
}
}
注:如果需要考虑合并单元格,存在图片,复制表格样式等功能请参考
https://blog.youkuaiyun.com/sjyshine/article/details/121956034