Accessing OpenType glyph variants in iText

Posted 2020-05-20 02:34

Question:

When building PDF documents with OpenType fonts in iText, I want to access glyph variants from within the font -- specifically tabular figures. Since OpenType glyph variants do not have Unicode indices, I am not sure how to either specify that I want to use a particular set of variants (tabular figures) or call a specific glyph by its glyph ID. Just looking for the relevant iText class name if one exists.

Answer 1:

This does not seem to be possible either in the latest tag, 5.5.8, or in the master branch of iText.

As explained in this article and in Microsoft's OpenType font file specification, glyph variants are stored in the Glyph Substitution Table (GSUB) of a font file. Accessing the glyph variants requires reading this table from the file. That is actually implemented in the class com.itextpdf.text.pdf.fonts.otf.GlyphSubstitutionTableReader, though the class is disabled for now.

The call readGsubTable() in the class com.itextpdf.text.pdf.TrueTypeFontUnicode is commented out.

void process(byte ttfAfm[], boolean preload) throws DocumentException, IOException {
    super.process(ttfAfm, preload);
    //readGsubTable();
}

It turns out that this line is disabled for a reason: the code does not work if you try to re-enable it.

So, unfortunately, there is no way to use glyph variants, as the substitution information is never loaded from the font file.

Update

The original answer was about whether the iText API can access glyph variants out of the box, which it cannot do yet. However, the low-level code is in place and, with some hacking, can be used to access the glyph substitution mapping table.

When read() is called, GlyphSubstitutionTableReader reads the GSUB table and flattens the substitutions of all features into a single map, Map<Integer, List<Integer>> rawLigatureSubstitutionMap. The symbolic names of the features are currently discarded by OpenTypeFontTableReader. The rawLigatureSubstitutionMap maps a variant glyphId to its base glyphId, or a ligature glyphId to a sequence of glyphIds, like this:

629 -> 66 // a.feature -> a
715 -> 71, 71, 77 // ffl ligature

This mapping can be reversed to get all variants for a base glyphId. So all extended glyphs with unknown Unicode values can be found through their connection to a base glyph or to a sequence of glyphs.
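As a rough illustration of that reversal, here is a minimal sketch in plain Java (Java 8 or later). It does not touch iText at all: SubstitutionIndex and invert() are hypothetical names, and the input is just a hand-built map with the two sample entries shown above, standing in for rawLigatureSubstitutionMap.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of iText: inverts a flattened GSUB map. */
final class SubstitutionIndex {

    /**
     * Inverts "substitute glyphId -> source glyphId sequence" into
     * "source glyphId sequence -> all substitute glyphIds".
     */
    static Map<List<Integer>, List<Integer>> invert(Map<Integer, List<Integer>> raw) {
        Map<List<Integer>, List<Integer>> bySource = new HashMap<>();
        for (Map.Entry<Integer, List<Integer>> e : raw.entrySet()) {
            // Copy the sequence so later changes to the input lists cannot corrupt the keys.
            List<Integer> source = new ArrayList<>(e.getValue());
            bySource.computeIfAbsent(source, k -> new ArrayList<>()).add(e.getKey());
        }
        return bySource;
    }

    public static void main(String[] args) {
        // Stand-in data copied from the sample entries in the answer.
        Map<Integer, List<Integer>> raw = new HashMap<>();
        raw.put(629, Arrays.asList(66));          // a.feature -> a
        raw.put(715, Arrays.asList(71, 71, 77));  // ffl ligature

        Map<List<Integer>, List<Integer>> bySource = invert(raw);
        System.out.println(bySource.get(Arrays.asList(66)));         // prints [629]
        System.out.println(bySource.get(Arrays.asList(71, 71, 77))); // prints [715]
    }
}
```

Because the feature names are discarded when the table is read, an index like this can only say that glyph 629 is some variant of glyph 66, not which feature (for example tabular figures) it belongs to.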

Next, to be able to write a glyph to PDF, we need to know a Unicode value for that glyphId. The Unicode -> glyphId relationship is held in the cmap31 field of TrueTypeFont. Reversing that map gives the Unicode value for a glyphId.
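The cmap reversal could look roughly like the sketch below. It is plain Java (8 or later) operating on a stand-in map, not on a real font. CmapInverter and glyphToUnicode() are hypothetical names, and the value layout int[]{glyphId, width} is an assumption about how cmap31 stores its entries, so check it against the iText version you use.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper, not part of iText: reverses a cmap-style map. */
final class CmapInverter {

    /**
     * Input: Unicode code point -> int[]{glyphId, width} (assumed layout).
     * Output: glyphId -> Unicode code point.
     */
    static Map<Integer, Integer> glyphToUnicode(Map<Integer, int[]> cmap) {
        Map<Integer, Integer> result = new HashMap<>();
        for (Map.Entry<Integer, int[]> e : cmap.entrySet()) {
            int glyphId = e.getValue()[0];
            // Several code points may share one glyph; keep the lowest one
            // so the result does not depend on map iteration order.
            result.merge(glyphId, e.getKey(), Integer::min);
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in data; glyph IDs and widths are made up for illustration.
        Map<Integer, int[]> cmap = new HashMap<>();
        cmap.put(0x61, new int[] {66, 500});   // 'a' -> glyph 66
        cmap.put(0x41, new int[] {36, 600});   // 'A' -> glyph 36
        cmap.put(0x391, new int[] {36, 600});  // Greek Alpha shares glyph 36

        Map<Integer, Integer> reversed = glyphToUnicode(cmap);
        System.out.println((char) reversed.get(66).intValue()); // prints a
        System.out.println((char) reversed.get(36).intValue()); // prints A
    }
}
```

Variant glyphs will normally be absent from the reversed map, since they have no Unicode value of their own; the lookup is useful for the base glyphs they substitute.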

Tweaking

rawLigatureSubstitutionMap cannot be accessed from outside GlyphSubstitutionTableReader, as it is a private member without a getter. The simplest hack is to copy-paste the original class and add a getter for the map:

public class HackedGlyphSubstitutionTableReader extends OpenTypeFontTableReader {

    // copy-pasted code ...

    public Map<Integer, List<Integer>> getRawSubstitutionMap() {
        return rawLigatureSubstitutionMap;
    }
}

The next problem is that GlyphSubstitutionTableReader needs the offset of the GSUB table, which is stored in the protected field HashMap<String, int[]> tables of the TrueTypeFont class. A helper class placed in the same package bridges access to the protected members of TrueTypeFont.

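package com.itextpdf.text.pdf;

import com.itextpdf.text.pdf.fonts.otf.FontReadingException;
import java.io.IOException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GsubHelper {
    private Map<Integer, List<Integer>> rawSubstitutionMap;

    public GsubHelper(TrueTypeFont font) {
        // get table offsets from the font instance
        Map<String, int[]> tables = font.tables;
        if (tables.get("GSUB") != null) {
            // Assumption: the reader expects a glyphId -> character map; it is
            // built here by reversing cmap31 (glyphId taken from index 0 of each value).
            Map<Integer, Character> glyphToCharacterMap = new HashMap<Integer, Character>();
            if (font.cmap31 != null) {
                for (Map.Entry<Integer, int[]> entry : font.cmap31.entrySet()) {
                    glyphToCharacterMap.put(entry.getValue()[0], (char) entry.getKey().intValue());
                }
            }
            HackedGlyphSubstitutionTableReader gsubReader;
            try {
                gsubReader = new HackedGlyphSubstitutionTableReader(
                        font.rf, tables.get("GSUB")[0], glyphToCharacterMap, font.glyphWidthsByIndex);
                gsubReader.read();
            } catch (IOException | FontReadingException e) {
                // keep the original exception as the cause for easier debugging
                throw new IllegalStateException(e.getMessage(), e);
            }
            rawSubstitutionMap = gsubReader.getRawSubstitutionMap();
        }
    }

    /**
     * Returns the glyphId substitution map, or null if the font has no GSUB table.
     */
    public Map<Integer, List<Integer>> getRawSubstitutionMap() {
        return rawSubstitutionMap;
    }
}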

It would be nicer to extend TrueTypeFont, but that would not work with the createFont() factory methods of BaseFont, which rely on hard-coded class names when creating a font.