I'm looking for a solution capable of producing a result as close to the original as possible, such as this example, which converts .doc rather than .ppt files. It should, ideally, take in a FileInputStream
and output a String
with the desired html code.
I've come across this question, which posts code very similar to that on the Apache POI website, however it converts to an image and I've been unable to re-purpose it.
Otherwise, there seems to be next to no code out there to do this.
EDIT:
I've tried implementing the Apache Tika solution, however I'm having trouble with the parser. I've seen that several people have had this issue when implementing the library on Android, however I haven't seen anyone suggest a solution.
My code is as follows:
HSLFSlideShow powerpointDoc = new HSLFSlideShow(inputDocument);
inputDocument.close();
List<HSLFSlide> slides = powerpointDoc.getSlides();
ContentHandler handler = new ToXMLContentHandler();
AutoDetectParser parser = new AutoDetectParser();
Metadata metadata = new Metadata();
for (int i = 0; i <= slides.size(); i++) {
parser.parse(inputDocument, handler, metadata);
}
String result = handler.toString();
Could anyone provide an example of how I might use Apache Tika?