Finding out what characters a font supports

Posted 2019-01-08 18:08

How do I extract the list of supported Unicode characters from a TrueType or embedded OpenType font on Linux?

Is there a tool or a library I can use to process a .ttf or a .eot file and build a list of code points (like U+0123, U+1234, etc.) provided by the font?

8 answers
对你真心纯属浪费
#2 · 2019-01-08 18:23

I just had the same problem, and made a HOWTO that goes one step further, baking a regexp of all the supported Unicode code points.

If you just want the array of code points, run ttx -t cmap myfont.ttf, rename myfont.ttx to myfont.xml (probably needed to invoke Chrome's XML mode), open the file in Chrome, and evaluate this in the devtools console:

function codepoint(node) { return Number(node.nodeValue); }
$x('//cmap/*[@platformID="0"]/*/@code').map(codepoint);

(This also relies on fonttools from gilamesh's suggestion; run sudo apt-get install fonttools if you're on an Ubuntu system.)
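The "baking a regexp" step mentioned above can be sketched in Python as follows. This is only an illustration, not the HOWTO's actual code: the helper names codepoints_to_ranges and ranges_to_char_class are hypothetical, the sample code points are made up, and the sketch assumes a non-empty list of code points (an empty character class is not a valid pattern).

```python
import re

def codepoints_to_ranges(codepoints):
    """Collapse code points into sorted, inclusive (start, end) ranges."""
    ranges = []
    for cp in sorted(set(codepoints)):
        if ranges and cp == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], cp)  # extend the current run
        else:
            ranges.append((cp, cp))           # start a new run
    return ranges

def ranges_to_char_class(ranges):
    """Build a regex character class from (start, end) ranges."""
    def esc(cp):
        # \uXXXX for the BMP, \UXXXXXXXX above it; both are understood by re.
        return "\\u%04x" % cp if cp <= 0xFFFF else "\\U%08x" % cp
    parts = []
    for start, end in ranges:
        parts.append(esc(start) if start == end else esc(start) + "-" + esc(end))
    return "[" + "".join(parts) + "]"

# Hypothetical sample; in practice the code points come from the font's cmap.
sample = [0x41, 0x42, 0x43, 0x61, 0x1F600]
pattern = ranges_to_char_class(codepoints_to_ranges(sample))
supported = re.compile(pattern)
```

A string can then be checked with something like all(supported.fullmatch(ch) for ch in text).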

Explosion°爆炸
#3 · 2019-01-08 18:23

You can do this on Linux in Perl using the Font::TTF module.

聊天终结者
#4 · 2019-01-08 18:32

The Linux program xfd can do this. It's provided in my distro as 'xorg-xfd'. To see all characters for a font, you can run this in a terminal:

xfd -fa "DejaVu Sans Mono"
我欲成王,谁敢阻挡
#5 · 2019-01-08 18:34

The character code points for a ttf/otf font are stored in the CMAP table.

You can use ttx to generate an XML representation of the CMAP table.

Run the command ttx.exe -t cmap MyFont.ttf (plain ttx on Linux) and it should write a file MyFont.ttx. Open it in a text editor and it will show you all the character codes it found in the font.
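Instead of reading the .ttx file by eye, it can be parsed with Python's standard library. The sketch below assumes the usual ttx layout, with map elements carrying a hexadecimal code attribute inside cmap_format_N subtables, and keeps only subtables whose platform/encoding IDs denote Unicode. The function names and the inline sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down stand-in for the contents of MyFont.ttx.
SAMPLE_TTX = """<ttFont>
  <cmap>
    <tableVersion version="0"/>
    <cmap_format_4 platformID="0" platEncID="3" language="0">
      <map code="0x20" name="space"/>
      <map code="0x41" name="A"/>
    </cmap_format_4>
    <cmap_format_0 platformID="1" platEncID="0" language="0">
      <map code="0x8a" name="adieresis"/>
    </cmap_format_0>
    <cmap_format_4 platformID="3" platEncID="1" language="0">
      <map code="0x41" name="A"/>
      <map code="0x123" name="gcedilla"/>
    </cmap_format_4>
  </cmap>
</ttFont>"""

def is_unicode_subtable(sub):
    """True for Unicode-platform and Windows Unicode (BMP/full) subtables."""
    platform = sub.get("platformID")
    encoding = sub.get("platEncID")
    return platform == "0" or (platform == "3" and encoding in ("1", "10"))

def codepoints_from_ttx(xml_text):
    """Return the sorted Unicode code points listed in a ttx cmap dump."""
    root = ET.fromstring(xml_text)
    found = set()
    for sub in root.findall("./cmap/*"):
        if not is_unicode_subtable(sub):
            continue  # e.g. Mac Roman codes are not Unicode code points
        for entry in sub.findall("map"):
            code = entry.get("code")
            if code is not None:
                found.add(int(code, 16))  # accepts the "0x" prefix
    return sorted(found)

for cp in codepoints_from_ttx(SAMPLE_TTX):
    print("U+%04X" % cp)
```

For a real font, read the file first, for example codepoints_from_ttx(open("MyFont.ttx", encoding="utf-8").read()).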

啃猪蹄的小仙女
#6 · 2019-01-08 18:36

Here is a method using the FontTools module (which you can install with something like pip install fonttools):

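#!/usr/bin/env python
from itertools import chain
import sys

from fontTools.ttLib import TTFont
from fontTools.unicode import Unicode

# Open the font; tables that fail to decompile do not raise an error.
# The verbose and allowVID arguments are omitted; they are not needed here.
ttf = TTFont(sys.argv[1], 0, ignoreDecompileErrors=True, fontNumber=-1)

# Build (code point, glyph name, Unicode character name) tuples from every
# cmap subtable. A list is used so the result can be iterated more than once.
chars = list(chain.from_iterable(
    [y + (Unicode[y[0]],) for y in x.cmap.items()]
    for x in ttf["cmap"].tables))
print(chars)

# Use this for just checking if the font contains the codepoint given as
# second argument:
#char = int(sys.argv[2], 0)
#print(Unicode[char])
#print(char in (x[0] for x in chars))

ttf.close()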

The script takes the font path as its argument:

python checkfont.py /path/to/font.ttf
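If the goal is just the U+XXXX list asked for in the question, a shorter variant is possible using TTFont.getBestCmap(), which returns the font's preferred Unicode mapping as a dict (or None if the font has no Unicode cmap). This is a sketch rather than a replacement for the script above; supported_codepoints and format_codepoint are hypothetical helper names.

```python
import sys

def supported_codepoints(font_path):
    """Return the sorted code points of the font's best Unicode cmap."""
    # Imported lazily so format_codepoint works without fontTools installed.
    from fontTools.ttLib import TTFont
    with TTFont(font_path) as font:
        best = font.getBestCmap()  # {code point: glyph name} or None
    return sorted(best) if best else []

def format_codepoint(cp):
    """Format an integer code point in the usual U+XXXX notation."""
    return "U+%04X" % cp

if __name__ == "__main__" and len(sys.argv) > 1:
    for cp in supported_codepoints(sys.argv[1]):
        print(format_codepoint(cp))
```

Checking a single character then comes down to ord(ch) in supported_codepoints(path).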
Melony?
#7 · 2019-01-08 18:36

If you ONLY want to "view" the fonts, the following might be helpful (if your terminal supports the font in question):

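#!/usr/bin/env python
import sys
from fontTools.ttLib import TTFont

with TTFont(sys.argv[1], 0, ignoreDecompileErrors=True) as ttf:
    for x in ttf["cmap"].tables:
        # cmap maps code point -> glyph name; only the glyph name is used.
        for (_, code) in x.cmap.items():
            # Assumes glyph names of the form uniXXXX, turned into \uxxxx
            # escapes; other glyph names pass through lowercased.
            point = code.replace('uni', '\\u').lower()
            # Emit a shell command that prints the character.
            print("echo -e '" + point + "'")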

An unsafe, but easy way to view:

python font.py my-font.ttf | sh

Thanks to Janus (https://stackoverflow.com/a/19438403/431528) for the answer above.
