How to ignore missing glyphs in font used by PDFBox

Posted 2019-08-16 16:21

Question:

I'm seeing "java.lang.IllegalArgumentException: No glyph for U+05D0 in font" (as an example) being thrown when calling the showText(String) method of PDPageContentStream.

Catching the exception isn't very helpful because good characters won't get written. Neither is checking each character in the input string, which would be a performance killer (each PDF could be thousands of pages, millions of characters). What I really need is a way to prevent the exception for ANY missing glyph and have it automatically replaced with some other glyph, or a dynamically created glyph that shows the unicode value.

I don't want to stop producing the PDF because a font doesn't support a particular glyph; I just want some replacement character to be used instead and keep going.

How to achieve this?

Answer 1:

This is what I did:

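// Requires java.util.stream.IntStream.
// Lookup table for every BMP code point: the character itself if the font
// can render it, otherwise '?'.
private final char[] replacements = IntStream.range(0, 1<<16)
    .map(c -> canRender(font, c) ? c : '?')
    .collect(StringBuilder::new, StringBuilder::appendCodePoint,
             StringBuilder::append)
    .toString().toCharArray();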

// This is extremely ugly!!!
private boolean canRender(PDType0Font font, int codepoint) {
    try {
        font.getStringWidth(new String(Character.toChars(codepoint)));
        return true;
    } catch (final Exception e) {
        return false;
    }
}

// Replaces every code point the font cannot render (and everything
// outside the BMP) with '?'.
String sanitize(String input) {
    return input.codePoints()
            .map(c -> c<replacements.length ? replacements[c] : '?')
            .collect(StringBuilder::new, StringBuilder::appendCodePoint,
                     StringBuilder::append)
            .toString();
}

I don't think it's worth optimizing, since far more work has to be done during PDF generation anyway, including the hasGlyph tests.
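
For anyone who wants to try the lookup-table idea without PDFBox on the classpath, here is a minimal self-contained sketch. The class name GlyphSanitizer and the IntPredicate parameter are my own assumptions, not part of PDFBox; the predicate just stands in for the font check above.

```java
import java.util.function.IntPredicate;

// Hypothetical helper (not a PDFBox class): precomputes a replacement
// table for all BMP code points and uses it to sanitize strings.
final class GlyphSanitizer {
    private static final char FALLBACK = '?';
    private final char[] replacements = new char[1 << 16];

    // canRender is an assumed stand-in for the real font check,
    // for example c -> canRender(font, c) from the answer above.
    GlyphSanitizer(IntPredicate canRender) {
        for (int c = 0; c < replacements.length; c++) {
            replacements[c] = canRender.test(c) ? (char) c : FALLBACK;
        }
    }

    // Keeps renderable BMP code points, replaces everything else
    // (including all supplementary code points) with FALLBACK.
    String sanitize(String input) {
        StringBuilder out = new StringBuilder(input.length());
        input.codePoints()
             .map(c -> c < replacements.length ? replacements[c] : FALLBACK)
             .forEach(out::appendCodePoint);
        return out.toString();
    }
}
```

With a real font you would build the sanitizer once per font and then call something like contentStream.showText(sanitizer.sanitize(text)). The table costs 65536 checks up front, after which each character is a single array lookup.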