How do web browsers implement font fallback?

Posted 2019-02-17 07:13

Question:

I'm interested in knowing where font fallback fits in the font shaping/rendering stack. In other words, at what point are missing glyphs detected and how are they substituted?

I see in this document that the FontConfig tool does font fallback "based on glyph coverage transparently."

So the questions are:

  1. How exactly does this algorithm work?
  2. Is this the standard algorithm used by most browsers: WebKit, Gecko (probably not IE)?
  3. How does font fallback based on missing glyphs within a font that does exist relate to CSS font fallback (which specifies which fonts to use in turn, when a font is entirely missing)?

Edit: I found this document which explains the "what" of FontConfig, but not the "how." Question 1 is about the "how."

To summarize - this post really has to do with one thing only - how does font fallback work when glyphs are missing in a font.

Answer 1:

Font fallback in browsers (as opposed to, say, in an OS) is based on two things:

  1. The CSS specification, which gives the fonts that are to be used for fallback, and
  2. The text engine, which does text shaping.

The CSS spec is fairly trivial in this respect: it simply gives a list of fonts by their system names, plus several possible generic "catch-all" families that are in no way guaranteed to be the same from computer to computer (there is no reason to assume that serif maps to Times or Times New Roman, for instance).

The fallback algorithm used by text engines is entirely up to the engine, but it usually kicks in during the glyph lookup step: the text engine sees a string of code points and tries to use a font to shape that string. For each code point in the sequence, it checks whether the font has a matching glyph (by consulting the cmap table and its subtables), or a rule, applied through the GSUB mechanism, that tells the engine there may be a glyph to use only if more code points follow. For instance, a font might have no glyphs for the individual letters e, t and c, but have a glyph for & and a GSUB rule that says the sequence e+t+c should be replaced in-text with the single glyph &. Once the engine has finished accumulating this kind of "unit of points", it shapes the text and hands it back to whatever asked it to shape text.
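The lookup step described above can be sketched in a few lines of Python. This is a toy model, not how any particular engine does it: the function name `lookup_glyphs`, the dict-based "cmap", and the ligature table keyed on code points are all assumptions made for illustration. Real engines apply GSUB rules to glyph IDs after the cmap mapping; matching on code points here simply follows the simplified description in the paragraph above.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy font data:
#   cmap:      code point -> glyph name
#   ligatures: sequence of code points -> single glyph name
Cmap = Dict[int, str]
Ligatures = Dict[Tuple[int, ...], str]


def lookup_glyphs(text: str, cmap: Cmap, ligatures: Ligatures) -> List[Optional[str]]:
    """Return one entry per shaped unit; None marks a code point with no glyph."""
    code_points = [ord(ch) for ch in text]
    glyphs: List[Optional[str]] = []
    i = 0
    while i < len(code_points):
        # Prefer the longest multi-code-point rule that matches at position i.
        best: Optional[Tuple[int, ...]] = None
        for sequence in ligatures:
            window = tuple(code_points[i:i + len(sequence)])
            if sequence and window == sequence:
                if best is None or len(sequence) > len(best):
                    best = sequence
        if best is not None:
            glyphs.append(ligatures[best])
            i += len(best)
        else:
            # Plain cmap lookup; a miss (None) is where fallback would kick in.
            glyphs.append(cmap.get(code_points[i]))
            i += 1
    return glyphs


# The font from the example: no glyphs for e, t or c, but an "&" glyph
# and a rule replacing e+t+c with it.
toy_cmap = {ord("&"): "ampersand", ord(" "): "space"}
toy_ligatures = {(ord("e"), ord("t"), ord("c")): "ampersand"}

print(lookup_glyphs("etc e", toy_cmap, toy_ligatures))
# ['ampersand', 'space', None]
```

The trailing `None` is the interesting case: the lone "e" has neither a cmap entry nor a matching rule, so the engine has to decide what to do next.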

If, during glyph lookup, it turns out the font doesn't contain anything that lets the engine shape a particular code point (i.e. running through the cmap data as well as the GSUB rules still shows "there is no glyph"), then the text engine can do two things:

  1. Give up. There is no glyph, so it uses the .notdef outline, defined as glyph ID 0, and generally gives you text with lovely empty boxes (lovingly called "tofu" by font folks) or question marks.
  2. Attempt font fallback, where it tries another font to find a glyph for the unsupported code point.

When using fallback, an engine can go down a list of alternative fonts until either (a) a glyph is found, or (b) the list is exhausted, at which point the engine has to give up and use the .notdef glyph. Whether the engine grabs the .notdef glyph from the original font or from the last font in the list is entirely up to the engine (although usually it'll go with the first font, for legibility).
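As a rough sketch of that loop, assuming each font is reduced to a name plus a code-point-to-glyph dict (the names `resolve_glyph`, `Primary` and `FallbackCJK` are made up for the example), the fallback walk might look like this. It takes the common choice of using the first font's .notdef when the list is exhausted; an engine is free to choose otherwise.

```python
from typing import Dict, List, Tuple

NOTDEF = ".notdef"  # conventionally glyph ID 0 in a font

# Hypothetical representation: (font name, code point -> glyph name)
Font = Tuple[str, Dict[int, str]]


def resolve_glyph(code_point: int, fonts: List[Font]) -> Tuple[str, str]:
    """Walk the font list in order; return (font name, glyph name)."""
    if not fonts:
        raise ValueError("need at least one font")
    for name, cmap in fonts:
        glyph = cmap.get(code_point)
        if glyph is not None:
            return name, glyph  # (a) a glyph was found
    # (b) list exhausted: give up and use .notdef from the first font.
    return fonts[0][0], NOTDEF


fonts = [
    ("Primary", {ord("A"): "A"}),
    ("FallbackCJK", {ord("漢"): "uni6F22"}),
]

print(resolve_glyph(ord("A"), fonts))   # ('Primary', 'A')
print(resolve_glyph(ord("漢"), fonts))  # ('FallbackCJK', 'uni6F22')
print(resolve_glyph(ord("☃"), fonts))   # ('Primary', '.notdef')
```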

There is no "standard" algorithm for this defined anywhere; font fallback is basically a convenience mechanism offered by text engine authors, like how browsers come with bookmark managers (handy, and not part of any spec). As far as OpenType is concerned, there are no requirements on whether an engine should just serve up .notdef when a glyph is not found, or whether it should serve up the part it could shape, then find the missing glyph somewhere else, and render text that way. CSS implies that your text engine should have at least some form of font fallback, but it doesn't specify how it should work, or when it should kick in.
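One way an engine could "serve up the part it could shape, then find the missing glyph somewhere else" is to split the text into runs, one per font, before shaping each run separately. The sketch below illustrates that idea only; `pick_font`, `split_into_runs`, and the coverage sets are hypothetical, and a real engine would also take scripts, clusters, and combining marks into account rather than working per character.

```python
from itertools import groupby
from typing import List, Optional, Set, Tuple

# Hypothetical representation: (font name, set of covered code points)
Coverage = Tuple[str, Set[int]]


def pick_font(ch: str, fonts: List[Coverage]) -> Optional[str]:
    """Return the first font that covers ch, or None if no font does."""
    for name, covered in fonts:
        if ord(ch) in covered:
            return name
    return None


def split_into_runs(text: str, fonts: List[Coverage]) -> List[Tuple[Optional[str], str]]:
    """Group consecutive characters that resolve to the same font."""
    tagged = [(pick_font(ch, fonts), ch) for ch in text]
    return [
        (font, "".join(ch for _, ch in group))
        for font, group in groupby(tagged, key=lambda pair: pair[0])
    ]


fonts = [
    ("Latin", {ord(c) for c in "abc "}),
    ("CJK", {ord("漢"), ord("字")}),
]

print(split_into_runs("ab漢字c☃", fonts))
# [('Latin', 'ab'), ('CJK', '漢字'), ('Latin', 'c'), (None, '☃')]
```

A run tagged `None` is one that no font in the list covers, which is where the engine would end up drawing .notdef.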