retain newline in word file generation using apach

Posted 2019-07-28 06:12

Question:

I am trying to use Apache POI to dynamically generate a Word file: I collect some data in an ArrayList and then print it both to the console and to the Word file. Both outputs are produced, but there is a problem with line breaks. I append a newline character to each ArrayList element so that the elements are printed one per line. In the console the newline works and each element appears on its own line, but in the generated Word file the line breaks are missing. How can I retain the line breaks in the generated Word file and remove the comma at the end of each array element? NOTE: the ArrayList is 'result', and isLinkBroken(new URL(element.getAttribute("href"))) is a function that returns some value. The relevant code snippet is given below:

protected void doPost(HttpServletRequest request,HttpServletResponse response)throws ServletException,IOException {
   String url= request.getParameter("url");
   System.setProperty("webdriver.chrome.driver", "H:\\suraj\\sftwr\\chromedriver_win32\\chromedriver.exe");
   ChromeDriver ff = new ChromeDriver();
   ff.get("http://"+url);
   ArrayList result = new ArrayList();        
   List<WebElement> allImages = findAllLinks(ff);   
   int i=0;
   System.out.println("Total number of elements found " + allImages.size());
   for( WebElement element : allImages){
      try {            
         if(!isLinkBroken(new URL(element.getAttribute("href"))).equals("OK")) {
            i++;
            System.out.println("inside"+i);
            System.out.println("URL: " + element.getAttribute("href")+ " returned " + isLinkBroken(new URL(element.getAttribute("href"))));
            result.add(i+"  URL: " + element.getAttribute("href")+ " returned " + isLinkBroken(new URL(element.getAttribute("href")))+"\n");
         }
      }
      catch(Exception exp) {
         System.out.println("outside");
         System.out.println("At " + element.getAttribute("innerHTML") + " Exception occurred -> " + exp.getMessage());
      }
   }
   System.out.println("OUTPUT");
   System.out.println(result.toString());
   FileOutputStream outStream=new FileOutputStream("H:\\suraj\\InactiveURL\\test.docx");
   XWPFDocument doc=new XWPFDocument();
   XWPFParagraph para = doc.createParagraph();
   para.setAlignment(ParagraphAlignment.LEFT);
   XWPFRun pararun=para.createRun();
   pararun.setText(result.toString());
   doc.write(outStream);
   outStream.close();
}    

Answer 1:

The Word .docx format doesn't encode newlines (or other whitespace breaks such as tabs) as their native ASCII characters. Instead, you need to use additional XML tags for them.

If you look at the JavaDocs for XWPFRun, you'll see all the whitespace break options, such as XWPFRun.addTab() and XWPFRun.addCarriageReturn().

There's a good example in the XWPF examples which you should read through. Basically though, to take the text

This is line one
This is line two

and encode it into .docx using XWPF, you would do something like:

XWPFParagraph p1 = doc.createParagraph();
XWPFRun r1 = p1.createRun();

r1.setText("This is line one");
r1.addCarriageReturn();
r1.setText("This is line two");

If you're starting from a block of text, split it on newlines, then add each resulting line with a separate run.setText call, with a run.addCarriageReturn call between consecutive lines.
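The split-and-add loop described above can be sketched roughly as follows. This is only an illustration, not code from the POI examples: the class LineBreakSketch and the helper writeLines are hypothetical names, and the helper takes two callbacks in place of a real XWPFRun so that it runs without POI on the classpath. The comment at the end shows how it would presumably be wired to XWPFRun.setText(String) and XWPFRun.addCarriageReturn().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class LineBreakSketch {

    // Hypothetical helper: sends each line to setText and calls addBreak
    // between consecutive lines (not after the last one).
    static void writeLines(String block, Consumer<String> setText, Runnable addBreak) {
        // \R matches \n, \r\n and \r; trailing empty lines are dropped by split.
        String[] lines = block.split("\\R");
        for (int i = 0; i < lines.length; i++) {
            if (i > 0) {
                addBreak.run();
            }
            setText.accept(lines[i]);
        }
    }

    public static void main(String[] args) {
        // Stand-in for a run: records the calls it receives, in order.
        List<String> calls = new ArrayList<>();
        writeLines("This is line one\nThis is line two\n",
                   text -> calls.add("setText(" + text + ")"),
                   () -> calls.add("addCarriageReturn()"));
        calls.forEach(System.out::println);

        // Assumed wiring with POI on the classpath:
        //   XWPFRun run = doc.createParagraph().createRun();
        //   writeLines(text, run::setText, run::addCarriageReturn);
    }
}
```

Run as-is, the stand-in records setText for the first line, then addCarriageReturn, then setText for the second line, which is the same call sequence as the r1 example above.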



Answer 2:

If you think in Word terms, when you hit the enter key you are really adding a new paragraph. If you want a break between lines you should be adding a new paragraph for each element in the array rather than trying to keep everything in a single paragraph.

Here are some modifications to your code:

protected void doPost(HttpServletRequest request,HttpServletResponse response)throws ServletException,IOException {
   String url= request.getParameter("url");
   System.setProperty("webdriver.chrome.driver", "H:\\suraj\\sftwr\\chromedriver_win32\\chromedriver.exe");
   ChromeDriver ff = new ChromeDriver();
   ff.get("http://"+url);
   ArrayList<String> result = new ArrayList<String>();        
   List<WebElement> allImages = findAllLinks(ff);   
   int i=0;
   System.out.println("Total number of elements found " + allImages.size());
   for( WebElement element : allImages){
      try {            
         if(!isLinkBroken(new URL(element.getAttribute("href"))).equals("OK")) {
            i++;
            System.out.println("inside"+i);
            System.out.println("URL: " + element.getAttribute("href")+ " returned " + isLinkBroken(new URL(element.getAttribute("href"))));
            result.add(i+"  URL: " + element.getAttribute("href")+ " returned " + isLinkBroken(new URL(element.getAttribute("href"))));
         }
      }
      catch(Exception exp) {
         System.out.println("outside");
         System.out.println("At " + element.getAttribute("innerHTML") + " Exception occurred -> " + exp.getMessage());
      }
   }
   System.out.println("OUTPUT");
   System.out.println(result.toString());
   FileOutputStream outStream=new FileOutputStream("H:\\suraj\\InactiveURL\\test.docx");
   XWPFDocument doc=new XWPFDocument();
   for (String elem : result) {
      XWPFParagraph para = doc.createParagraph();
      XWPFRun pararun=para.createRun();
      pararun.setText(elem);
   }
   doc.write(outStream);
   outStream.close();
}  

Note: I removed the newline character from your string and added a generic type parameter to your ArrayList. These should not change your output (except on the console). The real change is creating the paragraph inside a loop, which adds one paragraph per element to the document.



Answer 3:

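Rather than passing the list's toString() output to the setText method, you should iterate through the list and build the content with a StringBuilder. This avoids the brackets and commas that ArrayList.toString() inserts around and between the elements.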

Here is the sample code:

XWPFRun pararun = para.createRun();
StringBuilder content = new StringBuilder();
for (int j = 0; j < result.size(); j++) {
    content.append(result.get(j));
}
pararun.setText(content.toString());
doc.write(outStream);
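To see where the commas come from, here is a small standalone sketch comparing toString() with plain concatenation. It assumes nothing about POI: the class JoinSketch, the helper joinWithoutCommas, and the two sample entries are made up for illustration, and String.join("", ...) is used as a shorter equivalent of the StringBuilder loop above. It addresses only the commas; as Answer 1 explains, newline characters left in the joined text should not be relied on to appear as line breaks in Word.

```java
import java.util.ArrayList;
import java.util.List;

final class JoinSketch {

    // Hypothetical helper: concatenates the elements without the "[", ", "
    // and "]" that List.toString() adds around and between them.
    static String joinWithoutCommas(List<String> items) {
        return String.join("", items);
    }

    public static void main(String[] args) {
        // Made-up sample entries in the same shape as the question's list.
        List<String> result = new ArrayList<>();
        result.add("1  URL: http://example.com/a returned Not Found\n");
        result.add("2  URL: http://example.com/b returned Forbidden\n");

        // Brackets and ", " separators are included here.
        System.out.println(result.toString());

        // Only the elements themselves are printed here.
        System.out.print(joinWithoutCommas(result));
    }
}
```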