I would like to convert an HTML or XHTML document (preferably with styles) to Microsoft .doc and/or .docx format.
There seem to be plenty of examples for doing this the other way around, but I haven't found any useful examples for converting to the Microsoft document formats.
Can anyone point me to an API or provide an example for doing this, please?
Many thanks
docx4j 2.8.0 supports converting XHTML documents and fragments to docx content. Disclosure: I wrote some of the code.
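Since the import works on XHTML rather than loose HTML, it can help to confirm that the input is well-formed XML before handing it to a converter. The following is only a minimal sketch using the JDK's built-in SAX parser; it does not call docx4j, and the class name XhtmlCheck and method isWellFormed are hypothetical names chosen for illustration. It assumes the markup is available as a String, and it deliberately resolves any external DTD to an empty document so that no network access happens, which may cause named entities such as &nbsp; to be rejected.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of docx4j.
public class XhtmlCheck {

    /** Returns true if the markup parses as well-formed XML, false otherwise. */
    public static boolean isWellFormed(String markup) {
        SAXParser parser;
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            parser = factory.newSAXParser();
        } catch (ParserConfigurationException | SAXException e) {
            // The parser itself could not be created; this says nothing about the input.
            throw new IllegalStateException("Could not create SAX parser", e);
        }

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public InputSource resolveEntity(String publicId, String systemId) {
                // Assumption: skip external DTDs by substituting empty content,
                // so a DOCTYPE does not trigger a download.
                return new InputSource(new StringReader(""));
            }
        };

        try {
            parser.parse(new InputSource(new StringReader(markup)), handler);
            return true;
        } catch (SAXException e) {
            // Fatal parse errors (for example an unclosed <br>) end up here.
            return false;
        } catch (IOException e) {
            throw new IllegalStateException("Unexpected I/O error on in-memory input", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<p>Hello<br/>world</p>")); // true
        System.out.println(isWellFormed("<p>Hello<br>world</p>"));  // false
    }
}
```

If the check fails, the HTML would need to be tidied into XHTML first; which tool to use for that is left open here.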
Yet another solution would be to use JODConverter, which seems to do basic HTML to .doc conversion, although it doesn't claim to do it well.
I tried the docx4j API 2.8.1 and it works like a charm. It includes a ConvertInXHTMLFile sample, which works fine. If anyone wants the code I will post it.
Here is the link that helped me: ConvertInXHTMLFile
To work with Microsoft documents, you'll likely have to take a deeper look at the Apache POI library.
Nevertheless, creating .doc files with styling from (X)HTML requires some effort.
I've been spending a little time looking into docx4j. It seems to provide nice ways of creating HTML documents from docx, but I can't see anything for the other way round.
At the moment docx4j still looks like the easiest method, as it's just working with JAXB objects (I think).