Convert HTML page containing Arabic characters to

Posted 2019-04-24 22:59

Question:

I want to convert an HTML page that contains Arabic characters to a PDF file using FlyingSaucer, but in the generated PDF the letters are not joined into their combined (connected) forms and the text is printed backwards.

HTML:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
    <head>
        <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    </head>

    <body style="font-size:15px;font-family: Arial Unicode MS;">

        <center  style="font-size: 18px; font-family: Arial Unicode MS;">
            <b>
                <i style="font-family: Arial Unicode MS;">
                    &#x062C;&#x0645;&#x064A;&#x0639; &#x0627;&#x0644;&#x062D;&#x0642;&#x0648;&#x0642;<br />
                </i>
            </b>
        </center>
    </body>
</html>

Java Excerpt:

String inputFile = "c:\\html.html";
String url = new File(inputFile).toURI().toURL().toString();
String outputFile = "c:\\html.pdf";
OutputStream os = new FileOutputStream(outputFile);

ITextRenderer renderer = new ITextRenderer();
// Register the font referenced by the HTML and embed it with Unicode (Identity-H) encoding
renderer.getFontResolver().addFont("c://ARIALUNI.TTF", BaseFont.IDENTITY_H, BaseFont.EMBEDDED);

renderer.setDocument(url);
renderer.layout();
renderer.createPDF(os);
os.close();

Actual PDF Result:

Expected PDF Result:

What can I do to obtain the right result?

Answer 1:

When I was working with an Arabic font, I ran into a similar alignment issue. Arabic is an RTL (right-to-left) language, and you need specific JARs to generate PDFs in an RTL language. Right now the PDF is generated in the default LTR (left-to-right) mode, which is why you get the current output.
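
To illustrate the RTL point, here is a minimal sketch that uses only the JDK's java.text.Bidi class. It is not Flying Saucer code and does not fix the PDF. It only shows that the text from the question is a single right-to-left run, and that drawing it left to right in stored (logical) order is what makes it look backwards. The class and method names (BidiSketch, isRightToLeft, visualOrderForPureRtl) are hypothetical, and letter joining (shaping) is a separate step that this sketch does not cover.

```java
import java.text.Bidi;

public class BidiSketch {
    // The Arabic text from the question's HTML, in logical (stored) order.
    static final String SAMPLE =
        "\u062C\u0645\u064A\u0639 \u0627\u0644\u062D\u0642\u0648\u0642";

    /** True when the whole string resolves to right-to-left text. */
    static boolean isRightToLeft(String text) {
        // The base direction is taken from the first strong character, defaulting to LTR.
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        return bidi.isRightToLeft();
    }

    /**
     * Left-to-right drawing order for a purely RTL string: the logical order reversed.
     * Mixed-direction text and combining marks need full run-by-run reordering,
     * which this sketch deliberately does not attempt.
     */
    static String visualOrderForPureRtl(String logical) {
        return new StringBuilder(logical).reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println("Right-to-left: " + isRightToLeft(SAMPLE));
        String visual = visualOrderForPureRtl(SAMPLE);
        // A renderer that places glyphs left to right must start with the last logical letter.
        System.out.println("First glyph to draw: U+"
            + Integer.toHexString(visual.charAt(0)).toUpperCase());
    }
}
```

A renderer that skips this reordering step places the first logical letter at the far left, which matches the backwards output described in the question.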



Answer 2:

Yes, it is related to RTL, but if you have no constraints on which font to use, you can use an Arial font that contains all the characters you need. See https://stackoverflow.com/a/47801584/3335776 for the code.

Somehow the issue is with Flying Saucer's default fonts.

You can find the complete article here.
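
As a rough way to check the "font has all the required characters" condition before rendering, the sketch below scans a string and reports the first character that a given coverage test rejects. The class and method names (GlyphCoverage, firstUncovered) are hypothetical. The coverage test is passed in as a predicate, so nothing here depends on Flying Saucer or on any particular font API.

```java
import java.util.function.IntPredicate;

public class GlyphCoverage {
    /**
     * Returns the index of the first character whose code point the predicate
     * rejects, or -1 when every code point in the text is covered.
     */
    static int firstUncovered(String text, IntPredicate canDisplay) {
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            if (!canDisplay.test(codePoint)) {
                return i;
            }
            // Advance by one or two chars so surrogate pairs are handled correctly.
            i += Character.charCount(codePoint);
        }
        return -1;
    }

    public static void main(String[] args) {
        // Stand-in predicate for illustration only: pretend the font covers
        // the space character and the basic Arabic block (U+0600 to U+06FF).
        IntPredicate arabicOnly = cp -> cp == ' ' || (cp >= 0x0600 && cp <= 0x06FF);
        String sample = "\u062C\u0645\u064A\u0639 \u0627\u0644\u062D\u0642\u0648\u0642";
        System.out.println(firstUncovered(sample, arabicOnly));
    }
}
```

With a font loaded through java.awt.Font, the predicate could be font::canDisplay. That reports what AWT believes about the font file, which is assumed, not guaranteed, to match what the PDF renderer embeds.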