Character to Glyph mapping table

2019-02-19 14:10发布

问题:

I am following the documentation on apple.com.

I managed to get The 'cmap' encoding subtables. I know 100% that platformID, platformSpecificID are correct, but offset is suspicious. Here is the data:

array(3) {
  [0]=>
  array(3) {
    ["platform_id"]=>
    int(0)
    ["specific_id"]=>
    int(3)
    ["offset"]=>
    int(532)
  }
  [1]=>
  array(3) {
    ["platform_id"]=>
    int(1)
    ["specific_id"]=>
    int(0)
    ["offset"]=>
    int(28)
  }
  [2]=>
  array(3) {
    ["platform_id"]=>
    int(3)
    ["specific_id"]=>
    int(1)
    ["offset"]=>
    int(532)
  }
}

Offset for two tables is the same, 532. Can anyone explain me this? And is this offset from current position or from the beginning of the file?

part 2

Ok. So I managed to get to the format tables using this:

private function parseCmapTable($table)
{
    $this->position         = $table['offset'];

    // http://developer.apple.com/fonts/ttrefman/RM06/Chap6cmap.html
    // General table information

    $data   = array
    (
        'version'           => $this->getUint16(),
        'number_subtables'  => $this->getUint16(),
    );

    $sub_tables = array();

    for($i = 0; $i < $data['number_subtables']; $i++)
    {

        // http://developer.apple.com/fonts/ttrefman/RM06/Chap6cmap.html
        // The 'cmap' encoding subtables

        $sub_tables[]   = array
        (
            'platform_id'       => $this->getUint16(),
            'specific_id'       => $this->getUint16(),
            'offset'            => $this->getUint32(),
        );

    }

    // http://developer.apple.com/fonts/ttrefman/RM06/Chap6cmap.html
    // The 'cmap' formats

    $formats                = array();

    foreach($sub_tables as $t)
    {
        // http://stackoverflow.com/questions/5322019/character-to-glyph-mapping-table/5322267#5322267

        $this->position = $table['offset'] + $t['offset'];

        $format = array
        (
            'format'                    => $this->getUint16(),
            'length'                    => $this->getUint16(),
            'language'                  => $this->getUint16(),
        );

        if($format['format'] == 4)
        {
            $format     += array
            (
                'seg_count_X2'                  => $this->getUint16(),
                'search_range'                  => $this->getUint16(),
                'entry_selector'                => $this->getUint16(),
                'range_shift'                   => $this->getUint16(),
                'end_code[segCount]'            => $this->getUint16(),
                'reserved_pad'                  => $this->getUint16(),
                'start_code[segCount]'          => $this->getUint16(),
                'id_delta[segCount]'            => $this->getUint16(),
                'id_range_offset[segCount]'     => $this->getUint16(),
                'glyph_index_array[variable]'   => $this->getUint16(),
            );

            $backup = $format;

            $format['seg_count_X2']     = $backup['seg_count_X2']*2;
            $format['search_range']     = 2 * (2 * floor(log($backup['seg_count_X2'], 2)));
            $format['entry_selector']   = log($backup['search_range']/2, 2);
            $format['range_shift']      = (2 * $backup['seg_count_X2']) - $backup['search_range'];
        }

        $formats[$t['offset']]  = $format;
    }       

    die(var_dump( $sub_tables, $formats ));

The output:

array(3) {
[0]=>
  array(3) {
    ["platform_id"]=>
    int(0)
    ["specific_id"]=>
    int(3)
    ["offset"]=>
    int(532)
  }
  [1]=>
  array(3) {
    ["platform_id"]=>
    int(1)
    ["specific_id"]=>
    int(0)
    ["offset"]=>
    int(28)
  }
  [2]=>
  array(3) {
    ["platform_id"]=>
    int(3)
    ["specific_id"]=>
    int(1)
    ["offset"]=>
    int(532)
  }
}
array(2) {
  [532]=>
  array(13) {
    ["format"]=>
    int(4)
    ["length"]=>
    int(658)
    ["language"]=>
    int(0)
    ["seg_count_X2"]=>
    int(192)
    ["search_range"]=>
    float(24)
    ["entry_selector"]=>
    float(5)
    ["range_shift"]=>
    int(128)
    ["end_code[segCount]"]=>
    int(48)
    ["reserved_pad"]=>
    int(58)
    ["start_code[segCount]"]=>
    int(64)
    ["id_delta[segCount]"]=>
    int(69)
    ["id_range_offset[segCount]"]=>
    int(70)
    ["glyph_index_array[variable]"]=>
    int(90)
  }
  [28]=>
  array(3) {
    ["format"]=>
    int(6)
    ["length"]=>
    int(504)
    ["language"]=>
    int(0)
  }
}

Now, how do I get from here, to getting character Unicode codes? I tried reading the documentation, but it is too vague for a novice.

http://developer.apple.com/fonts/ttrefman/RM06/Chap6cmap.html

回答1:

The offset is from the beginning of the table. What your data is saying is that the Mac table (platformId 1) starts at offset 28, while the Unicode (platformId 0) and Windows (platformId 3) mappings share the same table that starts at byte offset 532.