Question:
How do I extract the list of supported Unicode characters from a TrueType or embedded OpenType font on Linux?
Is there a tool or a library I can use to process a .ttf or a .eot file and build a list of code points (like U+0123, U+1234, etc.) provided by the font?
Answer 1:
Here is a method using the FontTools module (which you can install with something like pip install fonttools):
#!/usr/bin/env python
from itertools import chain
import sys
from fontTools.ttLib import TTFont
from fontTools.unicode import Unicode
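# verbose= and allowVID= are dropped here; recent FontTools releases
# deprecate or no longer use them.
ttf = TTFont(sys.argv[1], 0, ignoreDecompileErrors=True, fontNumber=-1)
# Build a list (not a one-shot iterator) of (code point, glyph name,
# Unicode name) tuples so it can be printed and searched more than once.
chars = list(chain.from_iterable([y + (Unicode[y[0]],) for y in x.cmap.items()] for x in ttf["cmap"].tables))
print(chars)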
# Use this for just checking if the font contains the codepoint given as
# second argument:
#char = int(sys.argv[2], 0)
#print(Unicode[char])
#print(char in (x[0] for x in chars))
ttf.close()
The script takes the font path as its argument:
python checkfont.py /path/to/font.ttf
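If only the code points are needed, in the U+XXXX form the question asks for, the same cmap data can be reduced to a sorted set. This is a minimal sketch, assuming a reasonably recent FontTools; the helper names format_codepoints and font_codepoints are made up for illustration and are not part of FontTools:

```python
#!/usr/bin/env python3
import sys


def format_codepoints(codepoints):
    """Return sorted, de-duplicated code points as 'U+XXXX' strings."""
    return ["U+%04X" % cp for cp in sorted(set(codepoints))]


def font_codepoints(path, font_number=0):
    """Collect the code points of every Unicode cmap subtable in a font.

    font_number only matters for font collections (.ttc).
    """
    # Imported here so the formatting helper works without FontTools.
    from fontTools.ttLib import TTFont

    found = set()
    with TTFont(path, fontNumber=font_number) as font:
        for table in font["cmap"].tables:
            if table.isUnicode():
                found.update(table.cmap.keys())
    return found


if __name__ == "__main__" and len(sys.argv) > 1:
    print(" ".join(format_codepoints(font_codepoints(sys.argv[1]))))
```

Saved under a hypothetical name such as codepoints.py, it would be run as python codepoints.py /path/to/font.ttf.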
Answer 2:
The Linux program xfd can do this. It's provided in my distro as 'xorg-xfd'. To see all characters for a font, you can run this in a terminal:
xfd -fa "DejaVu Sans Mono"
Answer 3:
fc-query my-font.ttf
will give you a map of the supported glyphs and all the locales the font is appropriate for, according to fontconfig.
Since pretty much all modern Linux apps are fontconfig-based, this is much more useful than a raw Unicode list.
The actual output format is discussed here:
http://lists.freedesktop.org/archives/fontconfig/2013-September/004915.html
Answer 4:
The character code points for a ttf/otf font are stored in the cmap table.
You can use ttx to generate an XML representation of the cmap table. See here.
You can run the command ttx -t cmap MyFont.ttf (ttx.exe on Windows) and it should output a file MyFont.ttx. Open it in a text editor and it should show you all the character codes it found in the font.
Answer 5:
I just had the same problem, and made a HOWTO that goes one step further, baking a regexp of all the supported Unicode code points.
If you just want the array of code points, you can use this when peeking at your ttx XML in Chrome devtools, after running ttx -t cmap myfont.ttf and, probably, renaming myfont.ttx to myfont.xml to invoke Chrome's XML mode:
function codepoint(node) { return Number(node.nodeValue); }
$x('//cmap/*[@platformID="0"]/*/@code').map(codepoint);
(Also relies on fonttools from gilamesh's suggestion; sudo apt-get install fonttools if you're on an Ubuntu system.)
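The HOWTO itself is not reproduced here, so the following is not its method, just one possible way to bake such a regexp once you have the code points as integers. The function names to_ranges and to_char_class are invented for this sketch, and the \uXXXX / \UXXXXXXXX escapes it emits are the ones Python's re module accepts:

```python
import re


def to_ranges(codepoints):
    """Collapse code points into inclusive (start, end) runs."""
    ranges = []
    for cp in sorted(set(codepoints)):
        if ranges and cp == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], cp)  # extend the current run
        else:
            ranges.append((cp, cp))  # start a new run
    return ranges


def _escape(cp):
    """Spell a code point as an ASCII-only regex escape."""
    return "\\u%04x" % cp if cp <= 0xFFFF else "\\U%08x" % cp


def to_char_class(codepoints):
    """Build a character class matching exactly the given code points."""
    ranges = to_ranges(codepoints)
    if not ranges:
        raise ValueError("no code points given")
    parts = []
    for start, end in ranges:
        if start == end:
            parts.append(_escape(start))
        else:
            parts.append(_escape(start) + "-" + _escape(end))
    return "[" + "".join(parts) + "]"
```

A text could then be checked against the font's coverage with something like re.fullmatch(to_char_class(points) + "*", text).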
Answer 6:
The fontconfig commands can output the glyph list as a compact list of ranges, e.g.:
$ fc-match --format=%{charset} OpenSans
20-7e a0-17f 192 1a0-1a1 1af-1b0 1f0 1fa-1ff 218-21b 237 2bc 2c6-2c7 2c9
2d8-2dd 2f3 300-301 303 309 30f 323 384-38a 38c 38e-3a1 3a3-3ce 3d1-3d2 3d6
400-486 488-513 1e00-1e01 1e3e-1e3f 1e80-1e85 1ea0-1ef9 1f4d 2000-200b
2013-2015 2017-201e 2020-2022 2026 2030 2032-2033 2039-203a 203c 2044 2070
2074-2079 207f 20a3-20a4 20a7 20ab-20ac 2105 2113 2116 2120 2122 2126 212e
215b-215e 2202 2206 220f 2211-2212 221a 221e 222b 2248 2260 2264-2265 25ca
fb00-fb04 feff fffc-fffd
Use fc-query for a .ttf file and fc-match for an installed font name.
This likely doesn't involve installing any extra packages, and doesn't involve translating a bitmap.
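If individual code points are needed rather than ranges, the compact output is easy to expand. A rough sketch, assuming the output is whitespace-separated hexadecimal values or start-end ranges as in the sample above, and that fc-query accepts the same --format=%{charset} option shown for fc-match; the function names are hypothetical:

```python
import subprocess


def parse_charset(text):
    """Expand fontconfig's '20-7e a0 192' notation into a list of ints."""
    codepoints = []
    for token in text.split():
        start, _, end = token.partition("-")
        first = int(start, 16)
        last = int(end, 16) if end else first  # a single value is its own range
        codepoints.extend(range(first, last + 1))
    return codepoints


def charset_of_file(path):
    """Ask fc-query for a font file's charset and expand it."""
    result = subprocess.run(
        ["fc-query", "--format=%{charset}\n", path],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_charset(result.stdout)
```

For instance, parse_charset("20-22 a0") gives [32, 33, 34, 160].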
Answer 7:
If you ONLY want to "view" the fonts, the following might be helpful (if your terminal supports the font in question):
#!/usr/bin/env python
import sys
from fontTools.ttLib import TTFont
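with TTFont(sys.argv[1], 0, ignoreDecompileErrors=True) as ttf:
    for x in ttf["cmap"].tables:
        # cmap maps code point -> glyph name; only glyph names of the form
        # uniXXXX turn into a usable \uXXXX escape here.
        for (_, code) in x.cmap.items():
            point = code.replace('uni', '\\u').lower()
            print("echo -e '" + point + "'")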
An unsafe, but easy way to view:
python font.py my-font.ttf | sh
Thanks to Janus (https://stackoverflow.com/a/19438403/431528) for the answer above.
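The unsafe part of the approach above is piping generated text into sh. A variant that prints the characters straight from Python avoids the shell, and also works from the code points rather than the glyph names. This is a sketch under the assumption that skipping Unicode category C (control, format, surrogate, private-use, unassigned) is acceptable for viewing; printable_chars and show_font are hypothetical names:

```python
import sys
import unicodedata


def printable_chars(codepoints):
    """Join code points into a string, skipping Unicode category C."""
    chars = []
    for cp in sorted(set(codepoints)):
        ch = chr(cp)
        if unicodedata.category(ch)[0] != "C":
            chars.append(ch)
    return "".join(chars)


def show_font(path):
    """Print every viewable character the font's best cmap covers."""
    # Imported here so printable_chars works without FontTools.
    from fontTools.ttLib import TTFont

    with TTFont(path, fontNumber=0) as font:
        cmap = font.getBestCmap() or {}  # code point -> glyph name
    print(printable_chars(cmap))


if __name__ == "__main__" and len(sys.argv) > 1:
    show_font(sys.argv[1])
```

As before, whether the output is legible depends on the terminal being able to render the font in question.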
Answer 8:
You can do this on Linux in Perl using the Font::TTF module.