org.apache.poi.xwpf.converter.xhtml.XHTMLConverter

2019-06-09 00:34发布

问题:

I am using org.apache.poi.xwpf.converter.xhtml.XHTMLConverter class to convert docx to html. Below is my groovy code

public Map convert(String wordDocPath, String htmlPath,
        Map conversionParams)
{
    log.info("Converting word file "+wordDocPath)
    try
    {
        ...
        String notificationWorkingFolder = "C:\tomcats\Notification\store\Notification1234"

        FileInputStream fis = new FileInputStream(wordDocPath);
        XWPFDocument document = new XWPFDocument(fis);
        XHTMLOptions options = XHTMLOptions.create().URIResolver(new FileURIResolver(new File(notificationWorkingFolder)));
        File htmlFile = new File(htmlPath);
        OutputStream out = new FileOutputStream(htmlFile)
        XHTMLConverter.getInstance().convert(document, out, options);

        log.info("Converted to HTML file "+htmlPath)

        return [success:true,htmlFileName:getFileName(htmlPath)]
    }
    catch(Exception e)
    {
        log.error("Exception :"+e.getMessage(),e)
        return [success:false]
    }

}

The above code is converting docx to html successfully, but if docx contains any images it puts <img src="C:\tomcats\Notification\store\Notification1234\word\media\image1.png"> but do not copy the image to that folder. As a result, when I open html tag, all images appears empty. Am I missing something in code? Is there a way to generate an image srouce link instead of absolute path, like <img src="http://localhost:8080/webapp/image1.png">

回答1:

I got answer for first question from this link lychaox.com/java/poi/Word07toHtml.html. I had to add one line of code options.setExtractor(new FileImageExtractor(imageFolderFile)); to generate images. Second question I resolved by pattern search and replacement.