Extracting from an HTML string and generating

Posted 2019-08-17 13:25

I am trying to extract the HTML table tags from a string and render them as a table in a PDF that I download to my local machine.

Because the HTML content in the string is dynamic, I can't map it cell by cell or row by row.

For example:

private String message = "<html><body><p class=\"MsoNormal\"><b><span style=\"color: rgb(68, 84, 106);\">Dear Agent,<br><br>Please be informed that because no TRMF or reason for delay were received by the due date mentioned below, we consider the Transaction to be Paid in Error. We are going to act accordingly which means charging the Paying Account in case we are not able to defend legal dispute without TRMF.</span></b><span style=\"font-size: 10pt; line-height: 14.2667px;\"><o:p></o:p></span></p><p class=\"MsoNormal\"><span style=\"font-size: 10pt; line-height: 14.2667px;\">&nbsp;</span></p><div><span style=\"font-size: 10pt; line-height: 14.2667px;\"><br></span></div><table class=\"MsoNormalTable\" border=\"0\" cellspacing=\"0\" cellpadding=\"0\" width=\"0\" style=\"width: 472.9pt; margin-left: 5.9pt;border-collapse: collapse;\"><tr><td>Neeraj</td><td>Chand</td></tr><tr><td>Sowmya</td><td>Javvadi</td></tr></table></body></html>";

I will receive strings like this that hold HTML content, and I have to generate a PDF file from each one. The input string may or may not contain a table.

I tried the code below, but it doesn't work; I get an error saying "table width can't be 0".
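To make the "might or might not have a table" case concrete, here is a minimal sketch in plain Java that checks whether a string contains any tables and counts the cells in each table's first row. The class name `TableProbe` and the method `columnCounts` are hypothetical names chosen for illustration. The sketch relies on regular expressions, so it assumes no nested tables and ignores `colspan`; a real HTML parser would be more robust.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: reports, for each <table> in the input,
// how many <td>/<th> cells its first row contains.
class TableProbe {
    private static final Pattern TABLE = Pattern.compile("(?is)<table\\b[^>]*>(.*?)</table>");
    private static final Pattern ROW = Pattern.compile("(?is)<tr\\b[^>]*>(.*?)</tr>");
    private static final Pattern CELL = Pattern.compile("(?i)<t[dh]\\b");

    // Returns an empty list when the HTML contains no table at all.
    static List<Integer> columnCounts(String html) {
        List<Integer> counts = new ArrayList<>();
        Matcher table = TABLE.matcher(html);
        while (table.find()) {
            int cells = 0;
            Matcher row = ROW.matcher(table.group(1));
            if (row.find()) { // only the first row is inspected
                Matcher cell = CELL.matcher(row.group(1));
                while (cell.find()) {
                    cells++;
                }
            }
            counts.add(cells);
        }
        return counts;
    }

    public static void main(String[] args) {
        String html = "<p>Dear Agent,</p><table><tr><td>Neeraj</td><td>Chand</td></tr>"
                + "<tr><td>Sowmya</td><td>Javvadi</td></tr></table>";
        System.out.println(columnCounts(html));
    }
}
```

An empty result means the string has no table, so nothing table-specific needs to be built for that input.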

public StreamedContent getFile() throws IOException, DocumentException {
        final PortletResponse portletResponse = (PortletResponse) FacesContext.getCurrentInstance().getExternalContext()
                .getResponse();
        final HttpServletResponse res = PortalUtil.getHttpServletResponse(portletResponse);
        res.setContentType("application/pdf");
        res.setHeader("Cache-Control", "no-store, no-cache, must-revalidate");
        // res.setHeader("Content-Disposition", "attachment; filename=\".pdf\"");
        res.setHeader("Content-Disposition", "attachment; filename=" + subject + ".pdf");
        res.setHeader("Refresh", "1");
        res.flushBuffer();
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        OutputStream out = res.getOutputStream();
        Document document = new Document(PageSize.LETTER);
        PdfWriter.getInstance(document, baos);
        document.open();
        document.addCreationDate();
        /* without parsing html, it works and generates pdf
        Table table = new Table(2, 2);
        document.add(new Paragraph("converted to PdfPTable:"));
        table.setConvert2pdfptable(true);
        document.add(table);
         */

        //below doesn't work
        HTMLWorker htmlWorker = new HTMLWorker(document);
        String str = this.getMessage();
        htmlWorker.parse(new StringReader(str));
        PdfPTable table = new PdfPTable(2); // not sure what to pass here, as the number of columns is dynamic
        table.setTotalWidth(document.getPageSize().getWidth() - 80);
        document.add(table);
        document.close();
        baos.writeTo(out);
        out.flush();
        out.close();
        return null;
    }
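One possible cause of the width error, which is an assumption and not something confirmed in this post, is the `width="0"` attribute on the `<table>` tag in the sample string. If the HTML parser reads that attribute as the table width, it would end up with a zero-width table. The sketch below strips zero-valued `width` attributes from tags before the string is handed to the parser. The names `ZeroWidthStripper` and `stripZeroWidth` are hypothetical, and the tag regex assumes no attribute value contains a `>` character.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical pre-processing step: removes width="0" (quoted or unquoted)
// from HTML tags, leaving CSS such as style="width: 472.9pt" untouched.
class ZeroWidthStripper {
    // Matches one tag at a time; assumes attribute values contain no '>'.
    private static final Pattern TAG = Pattern.compile("<[^>]+>");
    private static final Pattern ZERO_WIDTH =
            Pattern.compile("(?i)\\s+width\\s*=\\s*(?:\"0\"|'0'|0(?=[\\s/>]))");

    static String stripZeroWidth(String html) {
        Matcher tags = TAG.matcher(html);
        StringBuffer out = new StringBuffer();
        while (tags.find()) {
            // Only text inside a tag is rewritten; body text is copied as-is.
            String cleaned = ZERO_WIDTH.matcher(tags.group()).replaceAll("");
            tags.appendReplacement(out, Matcher.quoteReplacement(cleaned));
        }
        tags.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<table border=\"0\" width=\"0\" style=\"width: 472.9pt;\">"
                + "<tr><td>Neeraj</td><td>Chand</td></tr></table>";
        System.out.println(stripZeroWidth(html));
    }
}
```

The cleaned string could then be passed to the parser in place of the value returned by `this.getMessage()`. Whether that removes the error depends on how the parser treats the attribute, so it is worth testing against the real input.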

Is there a way to generate a PDF from an arbitrary HTML string? If another tool would be better suited for this, please let me know.
