I am following the documentation on apple.com.
I managed to parse the 'cmap' encoding subtables. I am 100% sure that platformID and platformSpecificID are correct, but offset looks suspicious. Here is the data:
array(3) {
[0]=>
array(3) {
["platform_id"]=>
int(0)
["specific_id"]=>
int(3)
["offset"]=>
int(532)
}
[1]=>
array(3) {
["platform_id"]=>
int(1)
["specific_id"]=>
int(0)
["offset"]=>
int(28)
}
[2]=>
array(3) {
["platform_id"]=>
int(3)
["specific_id"]=>
int(1)
["offset"]=>
int(532)
}
}
The offset for two of the subtables is the same, 532. Can anyone explain this to me? And is this offset measured from the current position or from the beginning of the file?
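For reference, my current understanding (an assumption on my part, based on how the offset field is described for the encoding records) is that the offset is counted from the start of the 'cmap' table itself, and that several encoding records may point at one shared subtable. Here is a minimal sketch that groups the records above by offset. The helper name groupByOffset is hypothetical, not part of my parser:

```php
<?php
// Encoding records copied from the dump above.
$records = array(
    array('platform_id' => 0, 'specific_id' => 3, 'offset' => 532),
    array('platform_id' => 1, 'specific_id' => 0, 'offset' => 28),
    array('platform_id' => 3, 'specific_id' => 1, 'offset' => 532),
);

// Hypothetical helper: group encoding records by the subtable offset
// they point to, so each distinct subtable is listed only once.
function groupByOffset(array $records): array
{
    $groups = array();
    foreach ($records as $r) {
        // Label each record as "platformID/platformSpecificID".
        $groups[$r['offset']][] = $r['platform_id'] . '/' . $r['specific_id'];
    }
    return $groups;
}

$groups = groupByOffset($records);
// $groups[532] holds '0/3' and '3/1'; $groups[28] holds '1/0'.
```

If that reading is right, only two distinct subtables need to be parsed for this font, not three.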
Part 2
OK, so I managed to get to the format tables using this:
private function parseCmapTable($table)
{
    $this->position = $table['offset'];

    // http://developer.apple.com/fonts/ttrefman/RM06/Chap6cmap.html
    // General table information
    $data = array
    (
        'version'          => $this->getUint16(),
        'number_subtables' => $this->getUint16(),
    );

    $sub_tables = array();
    for($i = 0; $i < $data['number_subtables']; $i++)
    {
        // http://developer.apple.com/fonts/ttrefman/RM06/Chap6cmap.html
        // The 'cmap' encoding subtables
        $sub_tables[] = array
        (
            'platform_id' => $this->getUint16(),
            'specific_id' => $this->getUint16(),
            'offset'      => $this->getUint32(),
        );
    }

    // http://developer.apple.com/fonts/ttrefman/RM06/Chap6cmap.html
    // The 'cmap' formats
    $formats = array();
    foreach($sub_tables as $t)
    {
        // http://stackoverflow.com/questions/5322019/character-to-glyph-mapping-table/5322267#5322267
        // Subtable offsets are relative to the start of the 'cmap' table
        $this->position = $table['offset'] + $t['offset'];
        $format = array
        (
            'format'   => $this->getUint16(),
            'length'   => $this->getUint16(),
            'language' => $this->getUint16(),
        );
        if($format['format'] == 4)
        {
            $format += array
            (
                'seg_count_X2'                => $this->getUint16(),
                'search_range'                => $this->getUint16(),
                'entry_selector'              => $this->getUint16(),
                'range_shift'                 => $this->getUint16(),
                'end_code[segCount]'          => $this->getUint16(),
                'reserved_pad'                => $this->getUint16(),
                'start_code[segCount]'        => $this->getUint16(),
                'id_delta[segCount]'          => $this->getUint16(),
                'id_range_offset[segCount]'   => $this->getUint16(),
                'glyph_index_array[variable]' => $this->getUint16(),
            );
            $backup = $format;
            $format['seg_count_X2']   = $backup['seg_count_X2']*2;
            $format['search_range']   = 2 * (2 * floor(log($backup['seg_count_X2'], 2)));
            $format['entry_selector'] = log($backup['search_range']/2, 2);
            $format['range_shift']    = (2 * $backup['seg_count_X2']) - $backup['search_range'];
        }
        // Keyed by offset, so records sharing a subtable collapse into one entry
        $formats[$t['offset']] = $format;
    }
    die(var_dump( $sub_tables, $formats ));
}
The output:
array(3) {
[0]=>
array(3) {
["platform_id"]=>
int(0)
["specific_id"]=>
int(3)
["offset"]=>
int(532)
}
[1]=>
array(3) {
["platform_id"]=>
int(1)
["specific_id"]=>
int(0)
["offset"]=>
int(28)
}
[2]=>
array(3) {
["platform_id"]=>
int(3)
["specific_id"]=>
int(1)
["offset"]=>
int(532)
}
}
array(2) {
[532]=>
array(13) {
["format"]=>
int(4)
["length"]=>
int(658)
["language"]=>
int(0)
["seg_count_X2"]=>
int(192)
["search_range"]=>
float(24)
["entry_selector"]=>
float(5)
["range_shift"]=>
int(128)
["end_code[segCount]"]=>
int(48)
["reserved_pad"]=>
int(58)
["start_code[segCount]"]=>
int(64)
["id_delta[segCount]"]=>
int(69)
["id_range_offset[segCount]"]=>
int(70)
["glyph_index_array[variable]"]=>
int(90)
}
[28]=>
array(3) {
["format"]=>
int(6)
["length"]=>
int(504)
["language"]=>
int(0)
}
}
Now, how do I get from here to the Unicode character codes? I tried reading the documentation, but it is too vague for a novice.
http://developer.apple.com/fonts/ttrefman/RM06/Chap6cmap.html
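To make the question concrete, here is a rough sketch of what I think a format 4 reader has to do, assuming I read the layout correctly: endCode, startCode, idDelta and idRangeOffset are each arrays of segCount uint16 values (not single values), and the header fields such as searchRange are simply read as stored. The names readUint16 and decodeCmapFormat4 are hypothetical helpers, not part of my class, and the sample bytes are synthetic rather than taken from my font. It needs PHP 7 or later and does no bounds checking, so it is only meant for well-formed data.

```php
<?php
// Hypothetical helper: read a big-endian uint16 at byte offset $pos.
function readUint16(string $data, int $pos): int
{
    return unpack('n', substr($data, $pos, 2))[1];
}

// Hypothetical helper: decode one format 4 subtable into
// array(charCode => glyphIndex). $data starts at the 'format' field.
function decodeCmapFormat4(string $data): array
{
    $segCount = readUint16($data, 6) >> 1; // segCountX2 / 2

    // The fixed header is 14 bytes: format, length, language,
    // segCountX2, searchRange, entrySelector, rangeShift.
    $endCodeAt       = 14;
    $startCodeAt     = $endCodeAt + 2 * $segCount + 2; // +2 skips reservedPad
    $idDeltaAt       = $startCodeAt + 2 * $segCount;
    $idRangeOffsetAt = $idDeltaAt + 2 * $segCount;

    $map = array();
    for ($i = 0; $i < $segCount; $i++) {
        $end            = readUint16($data, $endCodeAt + 2 * $i);
        $start          = readUint16($data, $startCodeAt + 2 * $i);
        $delta          = readUint16($data, $idDeltaAt + 2 * $i);
        $rangeOffsetPos = $idRangeOffsetAt + 2 * $i;
        $rangeOffset    = readUint16($data, $rangeOffsetPos);

        for ($c = $start; $c <= $end; $c++) {
            if ($rangeOffset === 0) {
                // Direct mapping, arithmetic is modulo 65536.
                $glyph = ($c + $delta) & 0xFFFF;
            } else {
                // idRangeOffset is a byte offset measured from its own
                // position into glyphIndexArray.
                $pos   = $rangeOffsetPos + $rangeOffset + 2 * ($c - $start);
                $glyph = readUint16($data, $pos);
                if ($glyph !== 0) {
                    $glyph = ($glyph + $delta) & 0xFFFF;
                }
            }
            if ($glyph !== 0) { // glyph 0 means "missing glyph"
                $map[$c] = $glyph;
            }
        }
    }
    return $map;
}

// Synthetic subtable with two segments: 'A'..'C' mapped to glyphs 1..3,
// followed by the terminating 0xFFFF segment.
$sample = pack('n*',
    4, 32, 0,        // format, length, language
    4, 4, 1, 0,      // segCountX2, searchRange, entrySelector, rangeShift
    0x43, 0xFFFF,    // endCode[2]
    0,               // reservedPad
    0x41, 0xFFFF,    // startCode[2]
    0xFFC0, 1,       // idDelta[2] (0xFFC0 is -64 as a signed 16-bit value)
    0, 0             // idRangeOffset[2]
);

$map = decodeCmapFormat4($sample);
// $map is now array(65 => 1, 66 => 2, 67 => 3)
```

The keys of the resulting array would be the character codes I am after, at least for the Unicode subtables (platform 0, or platform 3 with encoding 1). Is this the right way to read the table?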