I'm trying to read the style information from an MS docx file. I have no problem writing file content with added styles such as bold, italic, font size, etc., but reading the file content and getting the style information back is not so clear. I've tried using XWPFDocument, but this API does not seem to have the ability to read the styles. I'm now trying XWPFWordExtractor, which seems a bit more promising, but I'm still stuck getting the style information for the text.
The type of content I'm reading looks similar to the following:
"Hello, this is bold text and this is italic text and this is bold-italic text"
Any pointers to an example would be great.
I gave up trying to use Apache POI and found another library called docx4j, which seems to do what I need: the properties I want to look at are now available. Once the docx file is loaded, you can view the content of the file in an XML format.
Okay, so based on the comments from Gagravarr, the solution is below, exactly as I wanted. Gagravarr effectively answered the question, but I'm not sure how to give him credit apart from saying it here.
Output below:
Current run IsBold : false
Current run IsItalic : false
"Hello, this is
Current run IsBold : true
Current run IsItalic : false
bold text
Current run IsBold : false
Current run IsItalic : false
and this is
Current run IsBold : false
Current run IsItalic : true
italic text
Current run IsBold : false
Current run IsItalic : false
a
Current run IsBold : false
Current run IsItalic : false
n
Current run IsBold : false
Current run IsItalic : false
d this is
Current run IsBold : true
Current run IsItalic : true
bold-italic text
Current run IsBold : false
Current run IsItalic : false
"
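For anyone landing here, a report of this shape can be produced by walking each paragraph's runs. What follows is only a minimal sketch, assuming Apache POI's XWPF usermodel is on the classpath; the class name `RunStyleReader`, the helper `describeRuns`, and the file name `sample.docx` are made up for illustration and are not the code from the original post.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;

public class RunStyleReader {

    /**
     * Hypothetical helper: for every run in the document body, records the
     * bold flag, the italic flag, and then the run's text (three lines per run).
     */
    public static List<String> describeRuns(XWPFDocument doc) {
        List<String> lines = new ArrayList<>();
        // getParagraphs() covers body paragraphs only; text inside tables is not visited here.
        for (XWPFParagraph paragraph : doc.getParagraphs()) {
            for (XWPFRun run : paragraph.getRuns()) {
                lines.add("Current run IsBold : " + run.isBold());
                lines.add("Current run IsItalic : " + run.isItalic());
                lines.add(run.text());
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // "sample.docx" is a placeholder path, not a file from the original question.
        String path = args.length > 0 ? args[0] : "sample.docx";
        try (InputStream in = new FileInputStream(path);
             XWPFDocument doc = new XWPFDocument(in)) {
            describeRuns(doc).forEach(System.out::println);
        }
    }
}
```

Note that Word may split what looks like one word into several runs (as with "a", "n", "d this is" above), so the run boundaries do not always line up with visible formatting changes.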
I found a very nice way to copy styles from one document to another. It is not as direct as I would have hoped but it works.
Copy the styles into your output document with the following code:
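As a rough illustration of the idea, here is a minimal sketch that assumes Apache POI's XWPF API and a template document that already contains a styles part. The class name `StyleCopier`, the helper `copyStyles`, and the file names are hypothetical and may differ from the code originally posted.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFStyles;
import org.apache.xmlbeans.XmlException;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTStyles;

public class StyleCopier {

    /**
     * Hypothetical helper: takes the style definitions of a template
     * and installs them in the target document.
     */
    public static void copyStyles(XWPFDocument template, XWPFDocument target)
            throws IOException, XmlException {
        // getStyle() parses the template's styles part; it fails if the template has none.
        CTStyles templateStyles = template.getStyle();
        // createStyles() adds a styles part to the target if it does not have one yet.
        XWPFStyles targetStyles = target.createStyles();
        targetStyles.setStyles(templateStyles);
    }

    public static void main(String[] args) throws Exception {
        // "template.docx" and "output.docx" are placeholder file names.
        try (XWPFDocument template = new XWPFDocument(new FileInputStream("template.docx"));
             XWPFDocument output = new XWPFDocument()) {
            copyStyles(template, output);
            // Paragraphs added here can refer to the copied styles by style ID.
            output.createParagraph().createRun().setText("Styled by the template");
            try (FileOutputStream out = new FileOutputStream("output.docx")) {
                output.write(out);
            }
        }
    }
}
```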
The same approach works for copying list formats.
Here is a very good way to copy styles from another document. A little background: a docx file is really a zip file containing a number of XML files, including styles.xml. In the following code sample I read styles.xml, parse it into a CTStyles object, then set it in the current document. Here is most of the code. You can use the same approach to copy numbering.xml for your Word numbering.
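The zip background is easy to check with nothing but the JDK. The sketch below is only an illustration, not the code from this answer: it opens the package as a zip stream, finds the styles part, and lists the style IDs it declares. It assumes the part lives at the conventional path `word/styles.xml` and needs Java 9+ for `readAllBytes()`; the class name `DocxStyleLister` and the file name are made up.

```java
import java.io.ByteArrayInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class DocxStyleLister {

    // WordprocessingML main namespace, used by both <w:style> and its w:styleId attribute.
    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Hypothetical helper: returns the w:styleId of every <w:style> in word/styles.xml. */
    public static List<String> listStyleIds(InputStream docx) throws Exception {
        List<String> ids = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                // Assumption: the styles part sits at its conventional location.
                if (!entry.getName().equals("word/styles.xml")) {
                    continue;
                }
                // Buffer the entry so the XML parser cannot close the zip stream underneath us.
                byte[] bytes = zip.readAllBytes();
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true);
                Document xml = factory.newDocumentBuilder().parse(new ByteArrayInputStream(bytes));
                NodeList styles = xml.getElementsByTagNameNS(W_NS, "style");
                for (int i = 0; i < styles.getLength(); i++) {
                    Element style = (Element) styles.item(i);
                    ids.add(style.getAttributeNS(W_NS, "styleId"));
                }
                break;
            }
        }
        return ids;
    }

    public static void main(String[] args) throws Exception {
        // "template.docx" is a placeholder file name.
        try (InputStream in = new FileInputStream(args.length > 0 ? args[0] : "template.docx")) {
            listStyleIds(in).forEach(System.out::println);
        }
    }
}
```

The same loop could look for `word/numbering.xml` instead if you want to inspect the list definitions.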
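You can use the following to check the bold flag in the paragraph-level run properties (getPPr() and getRPr() can return null, so guard against that first):
paragraph.getCTP().getPPr().getRPr().isSetB()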