Converting a docx file to HTML
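For docx documents, the idea is to write the converted HTML to an HTML file on the server, then read that file back and return its content to the front end.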
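import java.io.File;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
// Assumption: XHTMLConverter and XHTMLOptions come from the XDocReport converter, 2.x package
// (1.x releases use org.apache.poi.xwpf.converter.xhtml instead).
import fr.opensagres.poi.xwpf.converter.xhtml.XHTMLConverter;
import fr.opensagres.poi.xwpf.converter.xhtml.XHTMLOptions;

/**
 * Converts a docx file to HTML.
 *
 * @param inputStream stream of the docx file
 * @return the converted HTML as a string, or null if the conversion fails
 */
public static String docxToHtml(InputStream inputStream) {
    File file = null;
    try {
        file = File.createTempFile("temp_html", ".html");
        // Read the docx document
        XWPFDocument document = new XWPFDocument(inputStream);
        // Conversion options: keep unused styles and emit an HTML fragment
        XHTMLOptions options = XHTMLOptions.create();
        options.setIgnoreStylesIfUnused(false);
        options.setFragment(true);
        // Write the HTML to the temp file; the writer is closed (and flushed) before the file is read
        try (Writer out = new OutputStreamWriter(new FileOutputStream(file), StandardCharsets.UTF_8)) {
            XHTMLConverter xhtmlConverter = (XHTMLConverter) XHTMLConverter.getInstance();
            xhtmlConverter.convert(document, out, options);
        }
        // Read the generated HTML back as a string
        return new String(Files.readAllBytes(file.toPath()), StandardCharsets.UTF_8);
    } catch (Exception e) {
        e.printStackTrace();
        return null;
    } finally {
        // Delete the temp file so it does not pile up on the server
        if (file != null) {
            file.delete();
        }
    }
}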
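The write-then-read-back step does not depend on POI and can be tried in isolation. The following is a minimal sketch under assumptions: the class `TempHtmlRoundTrip` and the method `roundTrip` are hypothetical names that are not part of the original code, and UTF-8 is assumed throughout. It shows why the writer should be closed before the file is read, and how the temporary file can be deleted afterwards.

```java
import java.io.File;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

class TempHtmlRoundTrip {

    /** Writes html to a temporary file, reads it back as UTF-8, then deletes the file. */
    static String roundTrip(String html) throws IOException {
        // Hypothetical prefix and suffix, mirroring the temp file used for the docx conversion.
        File file = File.createTempFile("temp_html", ".html");
        try {
            // Closing the writer first guarantees that buffered output has reached the disk.
            try (Writer out = new OutputStreamWriter(
                    Files.newOutputStream(file.toPath()), StandardCharsets.UTF_8)) {
                out.write(html);
            }
            return new String(Files.readAllBytes(file.toPath()), StandardCharsets.UTF_8);
        } finally {
            // Clean up the temporary file whether or not the read succeeded.
            Files.deleteIfExists(file.toPath());
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("<p>你好, docx</p>"));
    }
}
```

If the file is read while the writer is still open, the tail of the HTML may still be sitting in the writer's buffer, so the returned string can be truncated or empty.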
Converting a doc file to HTML
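Compared with docx, reading a doc file is more involved. Pictures in the doc file can be saved to a local relative path, or saved to a server with the resulting image path returned; either way the generated HTML displays them correctly.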
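Saving a picture to a local relative path, as mentioned above, could look roughly like this. It is only a sketch under assumptions: `PictureStore`, `savePicture`, and the `images` directory are hypothetical names, and the returned path is assumed to be resolved relative to the directory that contains the generated HTML file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class PictureStore {

    /**
     * Saves the picture bytes under imageDir and returns a relative path
     * (directory name + "/" + file name) that can be used as an img src.
     */
    static String savePicture(byte[] content, Path imageDir, String suggestedName) throws IOException {
        Files.createDirectories(imageDir);
        // Keep only the file-name part so a crafted name cannot escape imageDir.
        String fileName = Paths.get(suggestedName).getFileName().toString();
        Files.write(imageDir.resolve(fileName), content);
        return imageDir.getFileName() + "/" + fileName;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical layout: <temp dir>/images/pic1.png next to the generated HTML file.
        Path base = Files.createTempDirectory("doc_html");
        String src = savePicture(new byte[] {1, 2, 3}, base.resolve("images"), "pic1.png");
        System.out.println(src);
    }
}
```

A picture callback would pass the picture bytes and the suggested file name to a helper like this and return the resulting path as the image source.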
import java.io.IOException;
import java.io.InputStream;
import java.io.StringWriter;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.TransformerFactoryConfigurationError;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.converter.WordToHtmlConverter;
import org.apache.poi.util.XMLHelper;
import org.w3c.dom.Document;

/**
 * Converts a doc file to HTML.
 *
 * @param inputStream stream of the doc file
 * @return the converted HTML as a string, or null if the conversion fails
 */
public static String docToHtml(InputStream inputStream) {
    try {
        // Read the Word document
        HWPFDocument wordDocument = new HWPFDocument(inputStream);
        // Create the DOM document that will receive the HTML
        Document newDocument = XMLHelper.getDocumentBuilderFactory().newDocumentBuilder().newDocument();
        WordToHtmlConverter wordToHtmlConverter = new WordToHtmlConverter(newDocument);
        // Optional picture handling; the manager has to be set before processDocument() to take effect.
        // wordToHtmlConverter.setPicturesManager((content, pictureType, suggestedName, widthInches, heightInches) -> {
        //     // content holds the picture bytes; save them to an image server and return the saved URL.
        //     // You can also save the picture to a local relative path and return that path; the HTML still displays it.
        //     return "https://csdnimg.cn/pubfooter/images/csdn_cs_qr.png";
        // });
        // Convert the Word document to HTML
        wordToHtmlConverter.processDocument(wordDocument);
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        // Do not let the Transformer add extra whitespace when writing the result tree
        transformer.setOutputProperty(OutputKeys.INDENT, "no");
        // Set the output encoding
        transformer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
        // Serialize as HTML rather than XML
        transformer.setOutputProperty(OutputKeys.METHOD, "html");
        // The result is written to a string here; to write straight to a file, pass a file stream to StreamResult instead
        StringWriter stringWriter = new StringWriter();
        transformer.transform(new DOMSource(wordToHtmlConverter.getDocument()), new StreamResult(stringWriter));
        return stringWriter.toString();
    } catch (IOException | ParserConfigurationException | TransformerException
            | TransformerFactoryConfigurationError e) {
        // TransformerConfigurationException is a subclass of TransformerException, so it is covered here
        e.printStackTrace();
        return null;
    }
}
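The last step of the doc conversion, serializing a DOM tree to an HTML string with a `Transformer`, uses only the JDK and can be tried without POI. Below is a minimal sketch under assumptions: `DomToHtml`, `serialize`, and `sampleDocument` are hypothetical names, and the hand-built DOM tree merely stands in for the document that `WordToHtmlConverter` would produce.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DomToHtml {

    /** Serializes a DOM document to an HTML string using the same output properties as docToHtml. */
    static String serialize(Document document) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "no");
        transformer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
        transformer.setOutputProperty(OutputKeys.METHOD, "html");
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(writer));
        return writer.toString();
    }

    /** Builds a tiny html/body/p tree as a stand-in for the converter's output document. */
    static Document sampleDocument() throws Exception {
        Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element html = document.createElement("html");
        Element body = document.createElement("body");
        Element p = document.createElement("p");
        p.setTextContent("hello");
        body.appendChild(p);
        html.appendChild(body);
        document.appendChild(html);
        return document;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize(sampleDocument()));
    }
}
```

To write the HTML directly to a file instead of a string, the `StringWriter` can be replaced by an output stream or writer opened on the target file.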