I ran into a problem when using Apache POI to convert my PPT
to images. My code is as follows:
FileInputStream is = new FileInputStream("test.ppt");
SlideShow ppt = new SlideShow(is);
is.close();
Dimension pgsize = ppt.getPageSize();
Slide[] slide = ppt.getSlides();
for (int i = 0; i < slide.length; i++) {
    BufferedImage img = new BufferedImage(pgsize.width, pgsize.height,
            BufferedImage.TYPE_INT_RGB);
    Graphics2D graphics = img.createGraphics();
    // clear the drawing area
    graphics.setPaint(Color.white);
    graphics.fill(new Rectangle2D.Float(0, 0, pgsize.width, pgsize.height));
    // render
    slide[i].draw(graphics);
    // save the output
    FileOutputStream out = new FileOutputStream("slide-" + (i + 1) + ".png");
    javax.imageio.ImageIO.write(img, "png", out);
    out.close();
}
It works fine, except that all Chinese characters are rendered as squares. How can I fix this?
This seems to be a bug in Apache POI. I have filed it in Bugzilla:
https://issues.apache.org/bugzilla/show_bug.cgi?id=54880
The problem is not on the POI side, but in the JVM font settings.
You need to set the font to one that is available to the JVM (in /usr/lib/jvm/jdk1.8.0_20/jre/lib/fonts
or similar), such as simsun.ttc:
XSLFTextShape[] phs = slide[i].getPlaceholders();
for (XSLFTextShape ts : phs) {
    java.util.List<XSLFTextParagraph> tpl = ts.getTextParagraphs();
    for (XSLFTextParagraph tp : tpl) {
        java.util.List<XSLFTextRun> trs = tp.getTextRuns();
        for (XSLFTextRun tr : trs) {
            logger.info(tr.getFontFamily());
            tr.setFontFamily("SimSun");
        }
    }
}
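Before hard-coding "SimSun", it may help to check which fonts the JVM actually sees. The following is only a sketch using the standard java.awt API. The class name CjkFontCheck, the helper firstInstalled, and the candidate font names are assumptions for illustration and are not part of POI. Adjust the candidate list to whatever is installed on your machine.

```java
import java.awt.Font;
import java.awt.GraphicsEnvironment;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Optional;

// Hypothetical helper class, not part of POI.
class CjkFontCheck {

    // Returns the first preferred family found in the installed list
    // (case-insensitive), using the spelling from the installed list.
    static Optional<String> firstInstalled(List<String> preferred,
                                           Collection<String> installed) {
        for (String candidate : preferred) {
            for (String name : installed) {
                if (name.equalsIgnoreCase(candidate)) {
                    return Optional.of(name);
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Font families visible to this JVM
        String[] families = GraphicsEnvironment.getLocalGraphicsEnvironment()
                .getAvailableFontFamilyNames();

        // Assumed candidate names; replace with fonts you expect to have
        List<String> preferred = Arrays.asList(
                "SimSun", "Microsoft YaHei", "WenQuanYi Micro Hei", "Noto Sans CJK SC");

        Optional<String> match = firstInstalled(preferred, Arrays.asList(families));
        if (match.isPresent()) {
            Font font = new Font(match.get(), Font.PLAIN, 12);
            // Sample Chinese text written as Unicode escapes;
            // canDisplayUpTo returns -1 when every character has a glyph
            boolean ok = font.canDisplayUpTo("\u4e2d\u6587\u6d4b\u8bd5") == -1;
            System.out.println(match.get() + " can display the sample text: " + ok);
        } else {
            System.out.println("No candidate CJK font is visible to the JVM");
        }
    }
}
```

If a font is found and can display the sample, its name is a reasonable value to pass to tr.setFontFamily(...). If none is found, install a CJK font in the JVM font directory or the system font directory first.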
The issue is the use of FileOutputStream, which will always write data to the file in the default system encoding, most probably ISO-8859-1 on Windows. Chinese characters are not supported by this encoding. You need to create a stream that writes using UTF-8 encoding, which requires creating a reader. I looked at the API but did not find any methods that take a reader as an argument. Check whether ImageOutputStream can help you.