Read the UTF-8 character at a specified position from an NSS


    NSString* str = @"1二3四5";
    NSLog(@"%c",[str characterAtIndex:0]); 
    NSLog(@"%c",[str characterAtIndex:1]);

NSString's characterAtIndex: works well for ASCII characters, but how can I get the UTF-8 character at index 2?

-- updated --
It seems unichar (16 bits) can't represent every UTF-8 encoded character (8 to 32 bits), so is there any method to get such a character from an NSString?

Tags: iphone objective-c macos nsstring

3 Answers

The star\"

#2 · 2020-05-21 05:50

Why don't you try something like this:

    const char *yourWantedCharacter = [[yourSourceString substringWithRange:yourRange] UTF8String];

where yourSourceString is your NSString object and yourRange is an NSRange with the index of the needed character as its location and a length of 1 (a length of 0 would select nothing and yield an empty string).

#3 · 2020-05-21 05:52

You'd use the more verbose methods:

    NSRange rangeOfSecondCharacter = [str rangeOfComposedCharacterSequenceAtIndex:1];
    NSString *secondCharacter = [str substringWithRange:rangeOfSecondCharacter];

...with proper bounds and range checking, of course. Note that this gives you an NSString, an object, not a unichar or some other primitive data type.

Ridiculous、

#4 · 2020-05-21 05:57

Unfortunately, Dave's answer doesn't actually do what you want. The index supplied to rangeOfComposedCharacterSequenceAtIndex: is the index of a UTF-16 code unit, 1 or 2 of which make up a code point. So 1 is not the index of the second code point if the first code point in the string requires 2 code units. (rangeOfComposedCharacterSequenceAtIndex: returns the range of the code point that includes the code unit at the given index, so if your first character requires 2 code units, passing an index of 0 or 1 returns the same range.)
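
To make the code-unit versus code-point distinction concrete, here is a minimal sketch in plain C (which also compiles inside an Objective-C file). It is not how Foundation implements rangeOfComposedCharacterSequenceAtIndex:. It only handles surrogate pairs and ignores combining marks, and the names CodeUnitRange and code_point_range_at are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type, loosely mirroring NSRange. */
typedef struct {
    size_t location; /* index of the first code unit */
    size_t length;   /* number of code units: 1 or 2 */
} CodeUnitRange;

static int is_high_surrogate(uint16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
static int is_low_surrogate(uint16_t u)  { return u >= 0xDC00 && u <= 0xDFFF; }

/* Range of the code point that contains the code unit at `index`.
   Assumption: only surrogate pairs are considered; an unpaired
   surrogate is reported as a range of length 1. */
CodeUnitRange code_point_range_at(const uint16_t *units, size_t count, size_t index)
{
    assert(units != NULL && index < count);
    CodeUnitRange r = { index, 1 };
    if (is_low_surrogate(units[index]) && index > 0 &&
        is_high_surrogate(units[index - 1])) {
        /* index points at the second half of a pair: step back one unit */
        r.location = index - 1;
        r.length = 2;
    } else if (is_high_surrogate(units[index]) && index + 1 < count &&
               is_low_surrogate(units[index + 1])) {
        /* index points at the first half of a pair */
        r.length = 2;
    }
    return r;
}
```

For the code units { 0xD83D, 0xDE00, 0x0031 } (U+1F600 followed by "1"), indexes 0 and 1 both give the range {0, 2}, while index 2 gives {2, 1}.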

If you want to find the UTF-8 sequence for a character, you can use UTF8String and then parse the resulting bytes to find the byte sequence for the nth character. Alternatively, you can use rangeOfComposedCharacterSequenceAtIndex: starting at index 0, iterate until you reach the nth character, and then convert its 1 or 2 UTF-16 code units to UTF-8 code units.
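
The last step of the second approach, turning the 1 or 2 UTF-16 code units of one character into UTF-8 bytes, could look roughly like this in plain C. The function name utf16_to_utf8 is hypothetical (it is not a Foundation API), and the caller is assumed to pass exactly the code units of a single code point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encodes the code point held in units[0..length) as UTF-8 into `out`.
   `length` must be 1 (BMP character) or 2 (surrogate pair).
   Returns the number of bytes written (1 to 4), or 0 if the input is
   an unpaired or malformed surrogate or has an unsupported length. */
size_t utf16_to_utf8(const uint16_t *units, size_t length, uint8_t out[4])
{
    assert(units != NULL && out != NULL);
    uint32_t cp;

    if (length == 1) {
        if (units[0] >= 0xD800 && units[0] <= 0xDFFF) return 0; /* lone surrogate */
        cp = units[0];
    } else if (length == 2) {
        if (units[0] < 0xD800 || units[0] > 0xDBFF) return 0;   /* not a high surrogate */
        if (units[1] < 0xDC00 || units[1] > 0xDFFF) return 0;   /* not a low surrogate */
        cp = 0x10000u + (((uint32_t)(units[0] - 0xD800) << 10) |
                         (uint32_t)(units[1] - 0xDC00));
    } else {
        return 0;
    }

    if (cp < 0x80) {
        out[0] = (uint8_t)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (uint8_t)(0xC0 | (cp >> 6));
        out[1] = (uint8_t)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (uint8_t)(0xE0 | (cp >> 12));
        out[1] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (uint8_t)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (uint8_t)(0xF0 | (cp >> 18));
    out[1] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (uint8_t)(0x80 | (cp & 0x3F));
    return 4;
}
```

For example, the single code unit 0x4E8C ("二") encodes to the three bytes E4 BA 8C.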

I hope we're all missing something and this is built-in...

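A start of a category which might help (with a basic bounds check that returns {NSNotFound, 0} when the string is too short):

    #import <Foundation/Foundation.h>

    @interface NSString (UTF)

    // Range, in UTF-16 code units, of the number-th (0-based) composed
    // character sequence; {NSNotFound, 0} if the string is too short.
    - (NSRange)rangeOfUTFCodePoint:(NSUInteger)number;

    @end

    @implementation NSString (UTF)

    - (NSRange)rangeOfUTFCodePoint:(NSUInteger)number
    {
        NSUInteger codeUnit = 0;
        NSRange result = NSMakeRange(NSNotFound, 0);
        for (NSUInteger ix = 0; ix <= number; ix++)
        {
            // Bounds check: stop before indexing past the end of the string.
            if (codeUnit >= self.length)
                return NSMakeRange(NSNotFound, 0);
            result = [self rangeOfComposedCharacterSequenceAtIndex:codeUnit];
            codeUnit += result.length;
        }
        return result;
    }

    @end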

However, this sort of work is more efficient on a char * than on an NSString.
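
As a rough illustration of the char * route, the sketch below walks a NUL-terminated UTF-8 buffer, such as the one returned by UTF8String, and reports the byte offset and length of the nth code point. utf8_nth_code_point and utf8_sequence_length are hypothetical names. Validation is limited to lead and continuation byte patterns, so overlong encodings and encoded surrogates are not rejected.

```c
#include <assert.h>
#include <stddef.h>

/* Number of bytes in the UTF-8 sequence that starts with `lead`,
   or 0 if `lead` cannot start a sequence (e.g. a continuation byte). */
static size_t utf8_sequence_length(unsigned char lead)
{
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 0;
}

/* Finds the n-th (0-based) code point in the NUL-terminated UTF-8 string `s`.
   On success stores its byte offset and byte length and returns 1.
   Returns 0 if the string has fewer than n + 1 code points or is malformed. */
int utf8_nth_code_point(const char *s, size_t n, size_t *offset, size_t *length)
{
    assert(s != NULL && offset != NULL && length != NULL);
    size_t pos = 0;
    for (;;) {
        unsigned char lead = (unsigned char)s[pos];
        if (lead == 0) return 0;               /* ran out of characters */
        size_t len = utf8_sequence_length(lead);
        if (len == 0) return 0;                /* invalid lead byte */
        for (size_t i = 1; i < len; i++) {
            /* every following byte must look like 10xxxxxx; a NUL fails
               this test, so we never read past the terminator */
            if (((unsigned char)s[pos + i] & 0xC0) != 0x80) return 0;
        }
        if (n == 0) {
            *offset = pos;
            *length = len;
            return 1;
        }
        n--;
        pos += len;
    }
}
```

With the UTF-8 bytes of "1二3四5", n = 1 yields offset 1 and length 3, which are the bytes of "二".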
