Suppose I have two .docx files, input.docx and output.docx. I need to select some of the content in input.docx and copy it to output.docx. When I print the contents of newdoc to the console they look correct, but output.docx contains nothing except blank lines. Can anyone offer advice?
InputStream is = new FileInputStream("D:\\input.docx");
XWPFDocument doc = new XWPFDocument(is);
List<XWPFParagraph> paras = doc.getParagraphs();
List<XWPFRun> runs;
XWPFDocument newdoc = new XWPFDocument();
for (XWPFParagraph para : paras) {
    runs = para.getRuns();
    if (!para.isEmpty()) {
        XWPFParagraph newpara = newdoc.createParagraph();
        XWPFRun newrun = newpara.createRun();
        for (int i = 0; i < runs.size(); i++) {
            newrun = runs.get(i);
            newpara.addRun(newrun);
        }
    }
}
List<XWPFParagraph> newparas = newdoc.getParagraphs();
for (XWPFParagraph para1 : newparas) {
    System.out.println(para1.getParagraphText());
} // in the console, I have the correct information
FileOutputStream fos = new FileOutputStream(new File("D:\\output.docx"));
newdoc.write(fos);
fos.flush();
fos.close();
I slightly modified your code; it now copies the text without changing its formatting.
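import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.List;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;

public static void main(String[] args) {
    // try-with-resources closes the input stream even if reading fails
    try (InputStream is = new FileInputStream("Japan.docx")) {
        XWPFDocument doc = new XWPFDocument(is);
        List<XWPFParagraph> paras = doc.getParagraphs();
        XWPFDocument newdoc = new XWPFDocument();
        for (XWPFParagraph para : paras) {
            // Skip paragraphs that contain no text
            if (!para.getParagraphText().isEmpty()) {
                XWPFParagraph newpara = newdoc.createParagraph();
                copyAllRunsToAnotherParagraph(para, newpara);
            }
        }
        // The output stream is flushed and closed automatically
        try (FileOutputStream fos = new FileOutputStream("newJapan.docx")) {
            newdoc.write(fos);
        }
    } catch (IOException e) {
        // FileNotFoundException is a subclass of IOException, so one handler covers both
        e.printStackTrace();
    }
}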
// Copy all runs from one paragraph to another, keeping the style unchanged
private static void copyAllRunsToAnotherParagraph(XWPFParagraph oldPar, XWPFParagraph newPar) {
    final int DEFAULT_FONT_SIZE = 10;
    for (XWPFRun run : oldPar.getRuns()) {
        // getText(0) returns the first text element of the run; skip runs without text
        String textInRun = run.getText(0);
        if (textInRun == null || textInRun.isEmpty()) {
            continue;
        }
        // POI returns -1 when the run does not define its own font size
        int fontSize = run.getFontSize();
        System.out.println("run text = '" + textInRun + "' , fontSize = " + fontSize);
        XWPFRun newRun = newPar.createRun();
        // Copy text
        newRun.setText(textInRun);
        // Apply the same style
        newRun.setFontSize((fontSize == -1) ? DEFAULT_FONT_SIZE : fontSize);
        // Font family and color may be null when the source run does not set them;
        // in that case leave the new run on the document defaults
        if (run.getFontFamily() != null) {
            newRun.setFontFamily(run.getFontFamily());
        }
        newRun.setBold(run.isBold());
        newRun.setItalic(run.isItalic());
        newRun.setStrike(run.isStrike());
        if (run.getColor() != null) {
            newRun.setColor(run.getColor());
        }
    }
}
There is still a small problem with fontSize. Sometimes POI cannot determine the font size of a run and returns -1 (I print the value to the console to trace it). It reads the size correctly when I set it myself (say, by selecting some paragraphs in Word and setting the size or the font family manually), but for other POI-generated text it sometimes returns -1. So I introduced a default font size (10 in the example above) that is applied whenever POI returns -1.
Another issue seems to arise with the Calibri font family. In my tests, however, POI set it to Arial by default, so I did not need a default-fontFamily trick like the one for fontSize.
The other font properties (bold, italic, etc.) work well.
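The fallback idea for fontSize can be pulled out into a tiny helper so the default lives in one place. This is only a minimal sketch: the class FontDefaults and its method names are hypothetical (they are not part of POI), and it assumes an undetermined size arrives as -1, as observed above, and an undefined font family arrives as null or an empty string.

```java
// Hypothetical helper, not part of Apache POI.
final class FontDefaults {
    private FontDefaults() {
    }

    // Returns the reported size, or the fallback when the size is not a positive number
    // (assumption: -1 means "the run does not define a size").
    static int sizeOrDefault(int reportedSize, int fallbackSize) {
        return reportedSize > 0 ? reportedSize : fallbackSize;
    }

    // Returns the reported family, or the fallback when none is defined
    // (assumption: an undefined family is reported as null or "").
    static String familyOrDefault(String reportedFamily, String fallbackFamily) {
        if (reportedFamily == null || reportedFamily.isEmpty()) {
            return fallbackFamily;
        }
        return reportedFamily;
    }
}
```

Inside the copy loop it could be used as newRun.setFontSize(FontDefaults.sizeOrDefault(run.getFontSize(), 10)); the value 10 is just the default chosen in the example above.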
All these font problems are probably due to the fact that in my tests the text was copied from a .doc file. If your input is a .doc file, open it in Word, choose "Save as..." and select the .docx format. Then use only XWPFDocument instead of HWPFDocument in your program, and I suppose it will be okay.
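If you are not sure which format a file really is (the extension can be misleading), you can look at its first bytes before deciding whether it needs converting. The sketch below is only an illustration with hypothetical names (WordFormatSniffer, Format); it relies on the fact that a .docx file is a ZIP archive starting with the bytes "PK\3\4", while an old binary .doc is an OLE2 compound file starting with D0 CF 11 E0 A1 B1 1A E1. Note that a ZIP signature only says the file is ZIP-based, not that it is a valid .docx.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not part of Apache POI.
final class WordFormatSniffer {
    enum Format { ZIP_BASED, OLE2, UNKNOWN }

    // ZIP local file header signature (used by .docx)
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};
    // OLE2 compound file signature (used by the old binary .doc)
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    private WordFormatSniffer() {
    }

    // Classifies a file by the leading bytes passed in.
    static Format detect(byte[] header) {
        if (startsWith(header, ZIP_MAGIC)) {
            return Format.ZIP_BASED;
        }
        if (startsWith(header, OLE2_MAGIC)) {
            return Format.OLE2;
        }
        return Format.UNKNOWN;
    }

    // Reads up to the first 8 bytes of the file and classifies them.
    static Format detect(Path file) throws IOException {
        byte[] header = new byte[8];
        int read = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (read < header.length) {
                int n = in.read(header, read, header.length - read);
                if (n < 0) {
                    break; // file is shorter than 8 bytes
                }
                read += n;
            }
        }
        return detect(Arrays.copyOf(header, read));
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A result of OLE2 would suggest the file is an old .doc that should be re-saved as .docx before it is opened with XWPFDocument.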