HTML to DOC (Word) or RTF Format

Posted 2019-08-16 08:45

Question:

What would be the best way to convert an HTML page (with CSS, tables, images, etc.) to Word or RTF format? I already know about adding the

content-type = application/word 

header, but that's not an option because we need the images embedded in the document so that it can be viewed without an internet connection.

I need a free (preferably) or commercial .NET library, or a command-line utility, because this has to run in a hosted ASP.NET application on a shared server :|.

Answer 1:

If you are using Word 2003 or 2007, you can convert XHTML documents to Word XML documents using XSLT. If you google for "html to docx xsl" you will mostly find examples of the opposite direction (converting docx to HTML), so you might use one of those as a basis for your conversion. The only challenge would be downloading the images and embedding them in the document, but that is also possible.
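
The image step mentioned above can be sketched roughly as follows. This is a minimal illustration in Python, chosen only for brevity (the question targets .NET, and the same steps should port over); it is not a complete converter. It collects every `<img>` source in the page and turns the downloaded bytes into base64 text, which is the form in which XML-based document formats typically carry inline binary data. The names `ImageSrcCollector`, `collect_embedded_images` and the `fetch` callback are hypothetical, and the Word XML element that would wrap the base64 text depends on the target schema and is not shown.

```python
import base64
from html.parser import HTMLParser


class ImageSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <img ... /> tags also end up here by default.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def collect_embedded_images(xhtml, fetch):
    """Return {src: base64 text} for each distinct image in the page.

    `fetch` is a caller-supplied function (an assumption of this sketch)
    that maps an image URL to its raw bytes.
    """
    collector = ImageSrcCollector()
    collector.feed(xhtml)
    embedded = {}
    for src in collector.sources:
        if src not in embedded:  # download each image only once
            embedded[src] = base64.b64encode(fetch(src)).decode("ascii")
    return embedded
```

In a real application `fetch` could, for example, wrap `urllib.request.urlopen(url).read()`; the resulting dictionary would then be passed to whatever stage writes the image data into the output document.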



Answer 2:

There are several possibilities for converting HTML to RTF. These links should get you started:

  • DocFrac, conversion between HTML, RTF and text. Free, runs on Windows.
  • XHTML2RTF: An HTML to RTF conversion tool based on XSL
  • Writing an RTF to HTML converter
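
To show why RTF suits the "images embedded, no internet needed" requirement, here is a small sketch of what a converter has to emit. It is written in Python only for brevity and is not how any of the tools listed above work internally. RTF stores a picture inline as hex text inside a `{\pict ...}` group (`\pngblip` marks PNG data, `\picwgoal` and `\pichgoal` give the display size in twips, 1440 per inch). The function names and the fixed Arial font table are assumptions made for this example.

```python
import struct


def rtf_escape(text):
    """Escape plain text for use inside an RTF document body."""
    out = []
    for ch in text:
        if ch in "\\{}":
            out.append("\\" + ch)        # RTF control characters
        elif ch == "\n":
            out.append("\\par ")         # paragraph break
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # Non-ASCII: emit each UTF-16 code unit as a signed 16-bit
            # \uN escape, followed by '?' as the fallback character.
            raw = ch.encode("utf-16-le")
            for unit in struct.unpack("<%dh" % (len(raw) // 2), raw):
                out.append("\\u%d?" % unit)
    return "".join(out)


def rtf_picture(png_bytes, width_twips=1440, height_twips=1440):
    """Return a \\pict group holding the PNG as inline hex data."""
    return "{\\pict\\pngblip\\picwgoal%d\\pichgoal%d %s}" % (
        width_twips, height_twips, png_bytes.hex())


def build_rtf(text, png_bytes):
    """Assemble a minimal one-font RTF document: text, then one image."""
    body = rtf_escape(text) + "\\par " + rtf_picture(png_bytes)
    return "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Arial;}}\\f0\\fs24 " + body + "}"
```

Given the bytes of a real PNG file (for instance read with `open(path, "rb").read()`), the string returned by `build_rtf` can be written to a `.rtf` file that carries the image inside it. Mapping HTML tables and CSS onto RTF is the hard part, which is what the tools above try to handle.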

Converting to MS Word .doc is much harder and probably not worthwhile for you. For the reasons this is such a pain, read Joel's interesting article on .doc. If you have to write .doc for some reason, COM interop with MSOffice is probably your best bet.