How to convert a RichText (RTF) document with images to HTML

Posted 2020-07-18 11:59

Question:

I have been trying to find a free (preferably open source) component or library that can convert an RTF file with embedded images into an HTML file plus image files or, better, into HTML and image streams.

The perfect solution, whether a DLL or a Delphi component, would stream data to IStream/TStream through callbacks, so that I could convert and save each image in a format of my choice and return the image's relative file name for the RTF parser to include in the generated HTML file. Saving the images as-is would also be fine, especially if the code is open source.
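To make the callback idea concrete, here is a minimal sketch in Python (chosen only for brevity; the same shape would map onto a Delphi callback writing to a TStream). It is not an RTF-to-HTML converter: it only finds simple `\pict` groups that contain hex-encoded data and no nested groups, hands the decoded bytes to a callback, and substitutes an `<img>` tag using the relative name the callback returns. The names `replace_pictures`, `save_image`, and `save_to_memory` are hypothetical, and pictures with nested property groups or `\bin` data are left untouched.

```python
import re
from typing import Callable, Dict

# Control words that identify the picture format inside a \pict group.
_FORMATS = {"pngblip": "png", "jpegblip": "jpg", "emfblip": "emf",
            "wmetafile": "wmf"}

# A \pict group without nested groups: control words followed by hex data.
_PICT = re.compile(r"\{\\pict\b([^{}]*)\}")


def replace_pictures(rtf: str, save_image: Callable[[bytes, str], str]) -> str:
    """Replace each simple \\pict group with an <img> tag.

    save_image(data, ext) is called once per picture and must return the
    relative name to use in the src attribute.
    """
    def _convert(match: "re.Match[str]") -> str:
        body = match.group(1)
        ext = "bin"  # fallback when no known format control word is present
        for name in re.findall(r"\\([a-z]+)", body):
            if name in _FORMATS:
                ext = _FORMATS[name]
        # Drop control words (with optional numeric parameter and one
        # delimiting space), then all whitespace; what remains is hex data.
        hex_text = re.sub(r"\\[a-z]+-?\d* ?", "", body)
        hex_text = re.sub(r"\s+", "", hex_text)
        data = bytes.fromhex(hex_text)
        return '<img src="%s">' % save_image(data, ext)

    return _PICT.sub(_convert, rtf)


# Usage: a callback that keeps images in memory and returns relative names.
saved: Dict[str, bytes] = {}


def save_to_memory(data: bytes, ext: str) -> str:
    name = "images/img%d.%s" % (len(saved) + 1, ext)
    saved[name] = data
    return name


# The hex payload here is just the first four bytes of a PNG signature.
sample = r"{\rtf1 Hello {\pict\pngblip\picw1\pich1 89504e47} world}"
html_fragment = replace_pictures(sample, save_to_memory)
```

Because the callback decides both the storage and the returned name, the caller could just as well re-encode the bytes (for example WMF to PNG) before saving. The rest of the RTF markup is deliberately left unconverted in this sketch.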

I have come across commercial solutions, but I struggle to justify them: the prices for a (relatively) simple conversion from one document type to another are quite high, and both formats are 20 years old, which suggests there must be an existing library (native, not managed) for this conversion.

If I don't find a solution, I will probably convert this code into a Delphi DLL and make it available, but maybe someone has already done it?

EDIT:

We've decided to use the aforementioned .NET RtfConverter compiled as a DLL, generate a Delphi TLB unit from it, and require customers to install the .NET Framework (bundled in the installer). Now conversion works like a charm, another sign it's time to move on from Delphi to .NET...

Answer 1:

If you COULD use Microsoft Office to open the RTF and save it as HTML in the background, then I believe this is your best solution: start a Microsoft Word instance in the background using OLE, load the RTF, and export it as HTML...
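A rough sketch of that automation, written in Python rather than Delphi for brevity. It assumes Windows, an installed copy of Microsoft Word, and the third-party pywin32 package; the function name `rtf_to_html_via_word` is hypothetical. The value 10 is Word's `wdFormatFilteredHTML` save format (8 would be the unfiltered `wdFormatHTML`).

```python
import os

# Word's WdSaveFormat value for "Web Page, Filtered".
WD_FORMAT_FILTERED_HTML = 10


def rtf_to_html_via_word(rtf_path: str, html_path: str) -> None:
    """Open an RTF file in a hidden Word instance and save it as HTML."""
    # Imported lazily: pywin32 is third-party and only works on Windows.
    import win32com.client

    word = win32com.client.DispatchEx("Word.Application")
    word.Visible = False
    try:
        # Word expects absolute paths when driven through automation.
        doc = word.Documents.Open(os.path.abspath(rtf_path), ReadOnly=True)
        try:
            doc.SaveAs(os.path.abspath(html_path),
                       FileFormat=WD_FORMAT_FILTERED_HTML)
        finally:
            doc.Close(False)  # close without saving changes to the RTF
    finally:
        word.Quit()
```

Word decides on its own where the extracted images go (normally a companion folder next to the HTML file), so this route gives no control over image names or formats, and it requires Word on every machine that runs the conversion.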



Answer 2:

A commercial converter for RTF to HTML 4.01 / HTML5 and RTF to various flavors of XHTML is ScroogeXHTML for Delphi. Version 5.0 included improved picture support, with example code for WMF to PNG conversion. (I am the developer of this component and its counterpart for the Java platform).



Answer 3:

P.S: I'm a developer of this product.

This is a commercial .NET library to convert RTF to HTML 3.2, 4.01, XHTML 1.01 and HTML 5. It supports converting tables and nested tables, ordered and bulleted lists, images embedded in HTML, Unicode, special HTML symbols, etc.

This is a sample code in C#:

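        SautinSoft.RtfToHtml r = new SautinSoft.RtfToHtml();
        // Produce HTML 5 output.
        r.OutputFormat = SautinSoft.RtfToHtml.eOutputFormat.HTML_5;
        // Keep images inside the HTML itself rather than in separate files.
        r.ImageStyle.IncludeImageInHtml = true;
        // Convert the source RTF file and write the result to the target path.
        r.ConvertFile(@"d:\document.rtf", @"d:\html5.htm");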