Inspired by "List of all unicode's open/close brackets?", I'm trying to find a list of all Unicode glyphs in a given font that are reflections of each other. First, I just need to be able to test whether one glyph is a reflection of another. Below are two attempts (two different implementations of my render_char function), but neither one identifies '(' and ')' as mirror images. How can I do this?
    from PIL import Image, ImageDraw, ImageFont
    import freetype
    import numpy as np

    def render_char0(c):
        # Based on https://github.com/rougier/freetype-py/blob/master/examples/hello-world.py
        # Needs numpy (blech) and the image comes out the inverse of the way I expect
        face = freetype.Face("/Library/Fonts/Verdana.ttf")
        face.set_char_size(48*64)
        face.load_char(c)
        bitmap = face.glyph.bitmap
        w, h = bitmap.width, bitmap.rows
        Z = np.array(bitmap.buffer, dtype=np.ubyte).reshape(h, w)
        return Image.fromarray(Z, mode='L').convert('1')

    def render_char1(c):
        # Based on https://stackoverflow.com/a/14446201/2829764
        verdana_font = ImageFont.truetype("/Library/Fonts/Verdana.ttf", 20, encoding="unic")
        text_width, text_height = verdana_font.getsize(c)
        canvas = Image.new('RGB', (text_width+10, text_height+10), (255, 255, 255))
        draw = ImageDraw.Draw(canvas)
        draw.text((5, 5), c, font=verdana_font, fill="#000000")
        return canvas

    for render_char in [render_char0, render_char1]:
        lparen = render_char('(')
        rparen = render_char(')')
        mirror = lparen.transpose(Image.FLIP_LEFT_RIGHT)
        mirror.show()
        rparen.show()
        print(mirror.tobytes() == rparen.tobytes())  # False
There is a text file called BidiMirroring.txt in the Unicode plain-text database that lists all mirrored characters. The file is easy to parse programmatically. The current URL is http://www.unicode.org/Public/UNIDATA/BidiMirroring.txt
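As a rough illustration of how such a file could be parsed, here is a minimal Python 3 sketch. It assumes each data line has the form `code; mirrored_code # comment` and that everything after `#` can be ignored. The function name `parse_bidi_mirroring` and the `SAMPLE` text are made up for this example; in practice you would pass in the contents of the downloaded file instead.

```python
# A few lines in the assumed format of BidiMirroring.txt (not the full file).
SAMPLE = """\
# BidiMirroring (sample lines only)
0028; 0029 # LEFT PARENTHESIS
0029; 0028 # RIGHT PARENTHESIS
003C; 003E # LESS-THAN SIGN
003E; 003C # GREATER-THAN SIGN
"""

def parse_bidi_mirroring(text):
    """Return a dict mapping each character to its mirrored counterpart."""
    pairs = {}
    for line in text.splitlines():
        # Drop the trailing comment and surrounding whitespace.
        data = line.split('#', 1)[0].strip()
        if not data:
            continue  # blank line or comment-only line
        fields = [field.strip() for field in data.split(';')]
        src, dst = fields[0], fields[1]
        # Code points are written in hexadecimal.
        pairs[chr(int(src, 16))] = chr(int(dst, 16))
    return pairs

pairs = parse_bidi_mirroring(SAMPLE)
print(pairs['('])
```

With the real file, the same function could be fed the text read from a local copy, e.g. `parse_bidi_mirroring(open("BidiMirroring.txt", encoding="utf-8").read())`.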
I don't think using the rendered glyphs can work reliably. There are many reasons why, e.g., ( and ) are not exact mirror images: spacing around the character, hinting and anti-aliasing, a slightly slanted font, or a font designer who simply made the two brackets a bit different. Other characters are rotated rather than mirrored, like “ and ” in some fonts, and the Chinese quotation marks 「 and 」.

I think rendering is the wrong approach. It depends on the font and whether the font knows how to render the character. I heard that Unicode characters have a specification for this symmetry. Maybe it is encoded in their names: "LEFT" and "RIGHT", "SUBSCRIPT". Have a look at http://xahlee.info/comp/unicode_matching_brackets.html
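To make the name-based idea concrete, here is a small Python 3 sketch using the standard `unicodedata` module. It swaps the words LEFT and RIGHT in a character's Unicode name and looks up the result. The helper `name_mirror` is hypothetical, and this is only a heuristic: it misses pairs whose names don't contain LEFT/RIGHT (such as < and >), and a name match does not guarantee that a given font draws the two glyphs as mirror images.

```python
import re
import unicodedata

_SWAP = {"LEFT": "RIGHT", "RIGHT": "LEFT"}

def name_mirror(ch):
    """Guess the partner of ch by swapping LEFT/RIGHT in its Unicode name.

    Returns None when the character has no name, the name contains
    neither word, or no character carries the swapped name.
    """
    try:
        name = unicodedata.name(ch)
    except ValueError:
        return None  # unnamed character
    swapped = re.sub(r"\b(LEFT|RIGHT)\b", lambda m: _SWAP[m.group(1)], name)
    if swapped == name:
        return None  # nothing to swap
    try:
        return unicodedata.lookup(swapped)
    except KeyError:
        return None  # no character with the swapped name

print(name_mirror('('))
# unicodedata.mirrored reports the Bidi_Mirrored property (1 or 0).
print(unicodedata.mirrored('('))
```

The `unicodedata.mirrored` call only says whether a character is flagged as mirrored in bidirectional text; it does not say which character is its partner, which is why the name swap (or BidiMirroring.txt) is still needed.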