Converting Word to HTML with POI (images included) for online Word preview

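The backend of this project uses Spring Boot and Maven; the front end uses the CKEditor rich-text editor. At the moment, the Word files produced from HTML are in doc format, while image handling supports docx, so a doc file has to be manually saved as docx before its images can be replaced.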

1. Add the Maven dependencies

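The project mainly relies on the POI-related dependencies below. jsoup is added as well to make it easier to pick image elements out of the HTML: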

<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>3.14</version>
</dependency>

<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-scratchpad</artifactId>
    <version>3.14</version>
</dependency>

<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>3.14</version>
</dependency>

<dependency>
    <groupId>fr.opensagres.xdocreport</groupId>
    <artifactId>xdocreport</artifactId>
    <version>1.0.6</version>
</dependency>

<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml-schemas</artifactId>
    <version>3.14</version>
</dependency>

<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>ooxml-schemas</artifactId>
    <version>1.3</version>
</dependency>

<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <version>1.11.3</version>
</dependency>

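2. Convert Word to HTML

In the Spring Boot project, create a static folder under the resources directory and copy the Word file to convert into it (test.doc or test.docx in the code below). static is one of Spring Boot's default static-resource locations, so nothing needs to be added to the configuration file. If you use a different folder name, configure it accordingly in application.yml.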

Converting doc to HTML:

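import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.converter.WordToHtmlConverter;
import org.springframework.util.ResourceUtils;

public static String docToHtml() throws Exception {
    // Classpath root, e.g. target/classes when the project runs from an IDE.
    File path = new File(ResourceUtils.getURL("classpath:").getPath());
    // Build paths with File so the separator is correct on every OS, not only Windows.
    File staticDir = new File(path, "static");
    File imageDir = new File(staticDir, "image");
    String sourceFileName = new File(staticDir, "test.doc").getAbsolutePath();
    String targetFileName = new File(staticDir, "test2.html").getAbsolutePath();
    if (!imageDir.exists()) {
        imageDir.mkdirs();
    }
    org.w3c.dom.Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
    WordToHtmlConverter wordToHtmlConverter = new WordToHtmlConverter(document);
    // Save each embedded picture and return its path relative to the HTML file.
    wordToHtmlConverter.setPicturesManager((content, pictureType, name, width, height) -> {
        try (FileOutputStream out = new FileOutputStream(new File(imageDir, name))) {
            out.write(content);
        } catch (Exception e) {
            e.printStackTrace();
        }
        return "image/" + name;
    });
    // Read the .doc file and close the input stream as soon as it has been processed.
    try (InputStream in = new FileInputStream(sourceFileName)) {
        wordToHtmlConverter.processDocument(new HWPFDocument(in));
    }
    // Serialize the resulting DOM to an indented UTF-8 HTML file.
    org.w3c.dom.Document htmlDocument = wordToHtmlConverter.getDocument();
    DOMSource domSource = new DOMSource(htmlDocument);
    StreamResult streamResult = new StreamResult(new File(targetFileName));
    TransformerFactory tf = TransformerFactory.newInstance();
    Transformer serializer = tf.newTransformer();
    serializer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
    serializer.setOutputProperty(OutputKeys.INDENT, "yes");
    serializer.setOutputProperty(OutputKeys.METHOD, "html");
    serializer.transform(domSource, streamResult);
    return targetFileName;
}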
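
The last step of docToHtml(), turning a DOM tree into HTML text, can be tried on its own without POI. The sketch below is only an illustration: the class name DomToHtmlSketch and the hand-built DOM are assumptions standing in for the converter's output, and the result is written to a String instead of a file. The output properties are the same ones used above.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DomToHtmlSketch {

    // Serializes a DOM tree to an HTML string using the same output
    // properties as docToHtml() (utf-8, indented, html method).
    static String serialize(Document doc) throws Exception {
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
        serializer.setOutputProperty(OutputKeys.INDENT, "yes");
        serializer.setOutputProperty(OutputKeys.METHOD, "html");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Builds a tiny hand-made DOM (hypothetical content) in place of
    // the tree that WordToHtmlConverter would produce.
    static Document sampleDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element html = doc.createElement("html");
        Element body = doc.createElement("body");
        Element p = doc.createElement("p");
        p.setTextContent("hello");
        Element img = doc.createElement("img");
        img.setAttribute("src", "image/0.png"); // relative path, as returned by the pictures manager
        body.appendChild(p);
        body.appendChild(img);
        html.appendChild(body);
        doc.appendChild(html);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize(sampleDocument()));
    }
}
```

Writing to a StringWriter like this may also be convenient when the HTML is only needed in memory and an intermediate file is not required.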

Converting docx to HTML:

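import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
// Converter classes, assuming the package names used by xdocreport 1.x.
import org.apache.poi.xwpf.converter.core.BasicURIResolver;
import org.apache.poi.xwpf.converter.core.FileImageExtractor;
import org.apache.poi.xwpf.converter.xhtml.XHTMLConverter;
import org.apache.poi.xwpf.converter.xhtml.XHTMLOptions;
import org.springframework.util.ResourceUtils;

public static String docxToHtml() throws Exception {
    // Classpath root, e.g. target/classes when the project runs from an IDE.
    File path = new File(ResourceUtils.getURL("classpath:").getPath());
    // Build paths with File so the separator is correct on every OS, not only Windows.
    File staticDir = new File(path, "static");
    File imageDir = new File(staticDir, "image");
    String sourceFileName = new File(staticDir, "test.docx").getAbsolutePath();
    String targetFileName = new File(staticDir, "test.html").getAbsolutePath();

    // try-with-resources closes both the input stream and the writer.
    try (InputStream in = new FileInputStream(sourceFileName);
         OutputStreamWriter outputStreamWriter =
                 new OutputStreamWriter(new FileOutputStream(targetFileName), StandardCharsets.UTF_8)) {
        XWPFDocument document = new XWPFDocument(in);
        XHTMLOptions options = XHTMLOptions.create();
        // Folder that receives the extracted images.
        options.setExtractor(new FileImageExtractor(imageDir));
        // Path prefix used for images inside the generated HTML.
        options.URIResolver(new BasicURIResolver("image"));
        XHTMLConverter xhtmlConverter = (XHTMLConverter) XHTMLConverter.getInstance();
        xhtmlConverter.convert(document, outputStreamWriter, options);
    }
    return targetFileName;
}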
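
Since doc and docx go through different converters, a caller needs some way to choose between docToHtml() and docxToHtml(). One simple possibility, sketched here as an assumption and not as part of the original project, is to look at the file extension. The class WordFormatSketch, the enum WordFormat and the method detect are hypothetical names.

```java
import java.util.Locale;

public class WordFormatSketch {

    // The two conversion paths described in this article.
    enum WordFormat { DOC, DOCX }

    // Picks the conversion path from the file extension (case-insensitive).
    // An extension is only a hint: a renamed file would still be misdetected.
    static WordFormat detect(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".docx")) {
            return WordFormat.DOCX;
        }
        if (lower.endsWith(".doc")) {
            return WordFormat.DOC;
        }
        throw new IllegalArgumentException("Not a Word file: " + fileName);
    }

    public static void main(String[] args) {
        System.out.println(detect("test.doc"));
        System.out.println(detect("test.docx"));
    }
}
```

A caller could then switch on the returned value and invoke the matching conversion method.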

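A successful conversion produces the corresponding HTML file. To show it on the front end, read the file into a String and return that to the front end.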

import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public static String readfile(String filePath) {
    try {
        // Read the whole file before decoding. Decoding fixed-size chunks one by one
        // can split a multi-byte UTF-8 character and garble the text.
        byte[] bytes = Files.readAllBytes(Paths.get(filePath));
        return new String(bytes, StandardCharsets.UTF_8);
    } catch (IOException e) {
        // Fail loudly instead of continuing with a null stream.
        throw new UncheckedIOException(e);
    }
}
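
One detail worth watching when reading the HTML back: if each fixed-size buffer is decoded on its own, a multi-byte UTF-8 character that straddles a buffer boundary can be corrupted, which matters for Chinese text. The sketch below illustrates the effect on a two-character string; the class Utf8ChunkSketch and its method names are hypothetical, and a tiny chunk size stands in for the 1024-byte buffer.

```java
import java.nio.charset.StandardCharsets;

public class Utf8ChunkSketch {

    // Decodes each chunk separately, as a read loop over a fixed-size buffer would.
    static String decodePerChunk(byte[] data, int chunkSize) {
        StringBuilder sb = new StringBuilder();
        for (int off = 0; off < data.length; off += chunkSize) {
            int len = Math.min(chunkSize, data.length - off);
            sb.append(new String(data, off, len, StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    // Decodes the complete byte array in one step.
    static String decodeWhole(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Two Chinese characters written as Unicode escapes; each takes 3 bytes in UTF-8.
        String text = "\u9884\u89c8";
        byte[] data = text.getBytes(StandardCharsets.UTF_8);
        // Whole-array decoding round-trips the text.
        System.out.println(decodeWhole(data).equals(text));
        // A 4-byte chunk cuts the second character in the middle.
        System.out.println(decodePerChunk(data, 4).equals(text));
    }
}
```

Collecting all bytes first, or wrapping the stream in an InputStreamReader with an explicit charset, avoids the problem because the decoder then sees the byte sequence without artificial breaks.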
