iOS CFStringTransform and Đ

Posted 2020-02-14 10:21

Question:

I'm working on an iOS app in which I have to list and sort people's names. I have a problem with some special characters.

I need some clarification on Martin R's answer at https://stackoverflow.com/a/15154823/2148377

You could use the CoreFoundation CFStringTransform function which does almost all transformations from your list. Only "đ" and "Đ" have to be handled separately:

Why this particular letter? Where does this come from? Where can I find the documentation?

Thanks a lot.

Answer 1:

I am not 100% sure, but I think it can be seen from the Unicode Character Database: http://www.unicode.org/Public/6.2.0/ucd/UnicodeData.txt

For example, the entry for "à" is

00E0;LATIN SMALL LETTER A WITH GRAVE;Ll;0;L;0061 0300;;;;N;LATIN SMALL LETTER A GRAVE;;00C0;;00C0

where field #6 (the "Decomposition mapping") is "0061 0300", that is, "a" followed by U+0300 (COMBINING GRAVE ACCENT). Therefore

CFStringTransform(..., kCFStringTransformStripCombiningMarks, ...)

transforms "à" into "a".
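
The same decomposition can be checked without reading UnicodeData.txt by hand. The sketch below uses Python's standard unicodedata module, not CoreFoundation, and it reflects whatever Unicode version the interpreter ships, which is not necessarily 6.2.0. strip_combining_marks is a hypothetical helper that only approximates the effect of kCFStringTransformStripCombiningMarks by decomposing the string and dropping nonspacing marks.

```python
import unicodedata

def strip_combining_marks(text: str) -> str:
    """Decompose canonically (NFD), drop nonspacing marks (Mn), recompose (NFC)."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    return unicodedata.normalize("NFC", kept)

# U+00E0 is LATIN SMALL LETTER A WITH GRAVE
print(unicodedata.decomposition("\u00e0"))  # 0061 0300
print(strip_combining_marks("\u00e0"))      # a
```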

The entries for "Đ" and "đ" are

0110;LATIN CAPITAL LETTER D WITH STROKE;Lu;0;L;;;;;N;LATIN CAPITAL LETTER D BAR;;;0111;
0111;LATIN SMALL LETTER D WITH STROKE;Ll;0;L;;;;;N;LATIN SMALL LETTER D BAR;;0110;;0110

where field #6 is empty, so these characters do not have a decomposition into a "base character" and a "combining mark".
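
By the same kind of check, "đ" and "Đ" report an empty decomposition, so a decompose-and-strip pass leaves them untouched and they need an explicit mapping. This is a minimal sketch, again using Python's unicodedata as a stand-in for the CoreFoundation call. fold_for_sorting and EXTRA_FOLDS are hypothetical names, and mapping "đ" to "d" is a choice made by the application, not something read from UnicodeData.txt.

```python
import unicodedata

# Hypothetical hand-written table for letters without a canonical decomposition:
# U+0111 (small d with stroke) and U+0110 (capital D with stroke).
EXTRA_FOLDS = str.maketrans({"\u0111": "d", "\u0110": "D"})

def fold_for_sorting(text: str) -> str:
    """Strip combining marks, then apply the hand-written folds."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    return stripped.translate(EXTRA_FOLDS)

print(repr(unicodedata.decomposition("\u0111")))           # ''
print(unicodedata.normalize("NFD", "\u0111") == "\u0111")  # True
print(fold_for_sorting("\u0110\u00e0o"))                   # Dao
```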

So the question remains: Which standard determines that a "normalized form" of "đ / Đ" is "d / D"?