Simplest way to extract first Unicode codepoint of

Posted 2020-04-08 11:33

For historical reasons, Cocoa's Unicode implementation is 16-bit: it handles Unicode characters above 0xFFFF via "surrogate pairs". This means that the following code is not going to work:

NSString myString = @"                

1 answer
手持菜刀,她持情操
Reply #2 · 2020-04-08 12:20

A single Unicode code point may be stored as a surrogate pair, but that is only half the problem: not every character a user perceives is a single code point. In other words, not every character is represented by one or two UTF-16 units; many are represented by a sequence of code points (a base letter followed by combining marks, for example).
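
To make the "one or two UTF-16 units" part concrete, here is a minimal sketch of how a code point maps to UTF-16. It is plain C, so it should also compile inside an Objective-C file. The function name `utf16_encode` is made up for this illustration and is not a Cocoa API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point as UTF-16.
   Writes 1 or 2 units into out[] and returns how many were written,
   or 0 if cp is not a valid Unicode scalar value. */
static size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
        return 0;                        /* out of range, or a lone surrogate */
    }
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;           /* fits in a single 16-bit unit */
        return 1;
    }
    cp -= 0x10000;                       /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 + (cp >> 10));    /* high surrogate: top 10 bits */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));  /* low surrogate: bottom 10 bits */
    return 2;
}
```

For U+1F600 this yields the two units 0xD83D 0xDE00, which is why the string's `length` counts such a character as 2.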

This means that unless you are dealing only with ASCII, you have to think of language characters as substrings, not as Unicode code points at fixed indexes.

To get the substring for the character at index 0:

NSRange r = [myString rangeOfComposedCharacterSequenceAtIndex:0];
NSString *firstCharacter = [myString substringWithRange:r];

This may or may not be what you want, depending on what you are actually trying to do. For example, although this gives you character boundaries, they will not necessarily correspond to cursor insertion points, which are language-specific.
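
If what you need is the numeric value of the first code point rather than the first composed character, a rough sketch of the decoding step follows. It is written in plain C so it can sit in an Objective-C file. It assumes you have already copied the string's UTF-16 units into a buffer (for instance with `getCharacters:range:`). The name `utf16_first_code_point` and the choice to return an unpaired surrogate unchanged are assumptions of this example, not Cocoa behaviour.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode the first code point from UTF-16 units.
   Stores the number of units consumed in *used (0 for an empty buffer).
   An unpaired surrogate is returned as-is with *used set to 1. */
static uint32_t utf16_first_code_point(const uint16_t *units, size_t len,
                                       size_t *used)
{
    if (len == 0) {
        *used = 0;
        return 0;
    }
    uint16_t hi = units[0];
    if (hi >= 0xD800 && hi <= 0xDBFF && len >= 2) {   /* high surrogate? */
        uint16_t lo = units[1];
        if (lo >= 0xDC00 && lo <= 0xDFFF) {           /* followed by a low one? */
            *used = 2;
            return 0x10000 + (((uint32_t)(hi - 0xD800) << 10)
                              | (uint32_t)(lo - 0xDC00));
        }
    }
    *used = 1;                                        /* BMP unit or lone surrogate */
    return hi;
}
```

CoreFoundation also ships helpers for the pair arithmetic, such as `CFStringIsSurrogateHighCharacter` and `CFStringGetLongCharacterForSurrogatePair`, which are probably preferable in real code. The hand-rolled version is only here to show what is going on.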
