Apache POI or docx4j for dealing with docx documents

Posted 2019-01-22 04:11

Which do you think is better for reading a docx document as Java objects, and why?

In other words, which library supports the most Word tags?

4 answers
混吃等死
Reply #2 · 2019-01-22 04:24

If you are working with docx documents, docx4j is more convenient than Apache POI. The following links cover the basics of docx4j. There is also a helpful docx4j forum.

1. http://blog.iprofs.nl/2012/09/06/creating-word-documents-with-docx4j/
2. http://www.smartjava.org/content/create-complex-word-docx-documents-programatically-docx4j?

家丑人穷心不美
Reply #3 · 2019-01-22 04:34

I tried Apache POI, but when printing content from a docx file (e.g. all "Heading1" elements), the output contained a lot of bad data and whitespace. In my tests, docx4j avoided this bad data.
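
To make the "Heading1" task concrete, here is a minimal sketch that uses only the JDK, not the docx4j or POI API. It assumes the usual docx layout (a zip archive whose main part is word/document.xml) and that headings carry the style ID "Heading1" in w:pStyle; localized or custom templates may use a different ID. The class and method names (DocxHeadings, headings, headingsFromDocx) are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of docx4j or Apache POI.
final class DocxHeadings {
    // Main WordprocessingML namespace (the "w:" prefix in document.xml).
    static final String W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Returns the text of every w:p whose first w:pStyle has the given style ID. */
    static List<String> headings(InputStream documentXml, String styleId) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for the *NS lookups below
        Document doc = factory.newDocumentBuilder().parse(documentXml);

        List<String> result = new ArrayList<>();
        NodeList paragraphs = doc.getElementsByTagNameNS(W, "p");
        for (int i = 0; i < paragraphs.getLength(); i++) {
            Element p = (Element) paragraphs.item(i);
            NodeList styles = p.getElementsByTagNameNS(W, "pStyle");
            if (styles.getLength() == 0) {
                continue; // paragraph has no explicit style
            }
            String val = ((Element) styles.item(0)).getAttributeNS(W, "val");
            if (!styleId.equals(val)) {
                continue;
            }
            // A paragraph's text is split across runs; join every w:t in order.
            // Simplification: paragraphs nested in text boxes are not treated specially.
            StringBuilder text = new StringBuilder();
            NodeList texts = p.getElementsByTagNameNS(W, "t");
            for (int j = 0; j < texts.getLength(); j++) {
                text.append(texts.item(j).getTextContent());
            }
            result.add(text.toString());
        }
        return result;
    }

    /** Opens a .docx file as a zip and scans its main document part. */
    static List<String> headingsFromDocx(String path, String styleId) throws Exception {
        try (ZipFile zip = new ZipFile(path)) {
            ZipEntry entry = zip.getEntry("word/document.xml");
            if (entry == null) {
                throw new IllegalArgumentException("No word/document.xml in " + path);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return headings(in, styleId);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length > 0) {
            System.out.println(headingsFromDocx(args[0], "Heading1"));
            return;
        }
        // Tiny in-memory sample so the sketch runs without a file.
        String xml = "<w:document xmlns:w=\"" + W + "\"><w:body>"
                + "<w:p><w:pPr><w:pStyle w:val=\"Heading1\"/></w:pPr>"
                + "<w:r><w:t>Intro</w:t></w:r><w:r><w:t>duction</w:t></w:r></w:p>"
                + "<w:p><w:r><w:t>Body text</w:t></w:r></w:p>"
                + "</w:body></w:document>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(headings(in, "Heading1"));
    }
}
```

Selecting paragraphs by style and joining only their w:t text is one way to avoid stray whitespace and unrelated content. Both docx4j and POI expose the same structure through typed Java objects, so with either library you would filter on the paragraph style rather than on raw extracted text.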

劫难
Reply #4 · 2019-01-22 04:35

Disclosure: I lead the docx4j project

Although docx4j can also handle pptx and xlsx, it is mostly used for docx manipulation. By way of illustration, at the time of writing there are nearly 1000 topics in the docx4j forum. The pptx forum has only 10% of that volume.

Whatever you want to do with the docx document, docx4j ought to be able to help you. There's a single page overview of a generic workflow.

For many common requirements, docx4j provides a higher-level API. These include:

  • Create/open/save docx (of course)

  • Report/document generation, using a variety of approaches: (i) Variable substitution, (ii) XML data binding (particularly strong), and (iii) Mailmerge

  • Export as HTML, XHTML

  • Export as PDF (with font support)

For anything else, you can manipulate the JAXB representation of the docx to your heart's content. JAXB is a Java community standard, included in Java SE 6 through 10 (a separate dependency from Java 11 onward), and with a strong alternative implementation in EclipseLink's MOXy. (POI uses XMLBeans instead of JAXB.)

There's a web app to help you explore a docx, and generate Java code to create corresponding Java objects.

Of course, if there is some specific task you have in mind, it may be that docx4j or POI has a particular strength there.

Both docx4j and POI are ASL v2 licensed.

docx4j is actively maintained; its source code is on GitHub.

In addition, commercial support is available for docx4j if you want it, as are several commercial extensions, e.g. MergeDocx.

docx4j does rely on POI as a library for its implementation of the OLE 2 Compound Document format, which we're grateful for.

疯言疯语
Reply #5 · 2019-01-22 04:37

I think Apache POI's main focus is on spreadsheets, though it also has features for reading Word documents, which it does using XMLBeans. docx4j deals mainly with docx documents, using JAXB. Since JAXB converts XML to Java objects, I think docx4j would be preferable for your case.
