Encoding of umlauts with the android pdf writer li

Posted 2019-06-09 14:43

Question:

I'm using the Android PDF Writer and I'm still confused. My PDF renderer has to write words containing umlauts and similar characters ('ß', 'Ä', 'Ü'), and they don't appear correctly in the PDF.

I think the issue is with the method getBytes(String encoding) from the String class.

PDFWriter pdfWriter = new PDFWriter(PaperSize.A4_WIDTH, PaperSize.A4_HEIGHT);
pdfWriter.setFont(StandardFonts.SUBTYPE, StandardFonts.SANS_SERIF, StandardFonts.MAC_ROMAN_ENCODING);
// only write some strings into the pdfwriter
parseData(pdfWriter);
outputToFile(filename, pdfWriter.asString(), "UTF-8");

When I inspect the result of pdfWriter.asString(), the umlauts are present.

private void outputToFile(String fileName, String pdfContent, String encoding) {
    File newFile = new File(fileName);
    Log.v(Constants.LOG_TAG, newFile.getAbsolutePath());
    try {
        newFile.createNewFile();
        try {
            FileOutputStream pdfFile = new FileOutputStream(newFile);
            // use the encoding passed in by the caller instead of a hard-coded name
            pdfFile.write(pdfContent.getBytes(encoding));
            pdfFile.close();
        } catch(FileNotFoundException e) {
            //
        }
    } catch(IOException e) {
        //
    }
}

Maybe there is a problem within the getBytes() method?

Answer 1:

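The answer is right under your nose: your PDF declares MacRoman as its font encoding, not UTF-8, but you write the file as UTF-8. The PDF viewer therefore decodes your UTF-8 bytes as MacRoman, and every umlaut, which UTF-8 stores as two bytes, shows up as two wrong characters.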
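To make the mismatch concrete, here is a minimal sketch in plain Java (no PDF library involved) that prints the bytes produced for 'Ä' under UTF-8 and under a single-byte encoding. The class name EncodingMismatchDemo and the helper hex() are made up for this illustration, and ISO-8859-1 stands in for the single-byte side because every JVM is guaranteed to ship it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {
    // Returns the bytes of `text` in `charset` as space-separated uppercase hex.
    static String hex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 'Ä' written as a Unicode escape so the source file's own encoding does not matter
        String umlaut = "\u00C4";
        System.out.println(hex(umlaut, StandardCharsets.UTF_8));      // C3 84
        System.out.println(hex(umlaut, StandardCharsets.ISO_8859_1)); // C4
    }
}
```

A viewer that expects one byte per character reads the two UTF-8 bytes as two separate glyphs, which is the kind of garbling described in the question.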

As a quick fix, make both sides agree: use StandardEncodings.WIN_ANSI_ENCODING as the font encoding on the PDF side, and pass "WINDOWS-1252" or "ISO-8859-1" instead of "UTF-8" to getBytes() on the file-writing side.
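The file-writing half of that fix could look roughly like the sketch below. It is a reworked version of the question's outputToFile() that takes a Charset and encodes the content with it. The class name PdfFileOutput and the changed signature are assumptions made for this example, not part of the Android PDF Writer library, and try-with-resources assumes a platform level that supports it.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;

public class PdfFileOutput {
    // Writes pdfContent to file, encoding each character with the given charset.
    // The charset must match the font encoding declared inside the PDF.
    static void outputToFile(File file, String pdfContent, Charset charset) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file)) {
            out.write(pdfContent.getBytes(charset));
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("umlauts", ".bin");
        try {
            // "ßÄÜ" as Unicode escapes; windows-1252 stores each as a single byte
            outputToFile(tmp, "\u00DF\u00C4\u00DC", Charset.forName("windows-1252"));
            System.out.println(Files.readAllBytes(tmp.toPath()).length + " bytes written");
        } finally {
            tmp.delete();
        }
    }
}
```

In the question's code the call would then become something like outputToFile(new File(filename), pdfWriter.asString(), Charset.forName("windows-1252")), together with the WinAnsi font encoding on the PDF writer.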