What does Cocoa Touch's canonical NSUnicodeStr

Posted 2019-07-25 15:53

Question:

Is it UTF-16? 32? Something else?

I want to look into this for performance reasons: I'm converting a lot of strings from UTF-8 to "native NSString", and the performance penalty seems to land on __CFFromUTF8, a built-in conversion function. By the way, I'm only guessing that NSUnicodeStringEncoding is what is used internally, since NSString's fastestEncoding returns that value for international strings (for ANSI-only strings it returns MacRoman).

Answer 1:

Testing with dataUsingEncoding: indicates that NSUnicodeStringEncoding produces little-endian UTF-16 preceded by a byte order mark, on both the simulator and a real device. Apple's String Programming Guide for Cocoa says "NSString objects are conceptually UTF-16 with platform endianness", so I think it's reasonable to assume UTF-16 is used internally.

(The same guide goes on to say "That doesn’t necessarily imply anything about their internal storage mechanism", so Apple is fully reserving the right to change this in the future.)
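
To make the byte-level check concrete, here is a minimal sketch of how a dump like the one dataUsingEncoding: returns can be read. It is written in Python rather than Objective-C so it runs anywhere; the function name describe_bom and the sample string are illustrative assumptions, and the sample bytes are built by hand to mimic the "BOM + little-endian UTF-16" layout described above, not captured from a device.

```python
import codecs


def describe_bom(data: bytes) -> str:
    """Classify a byte dump by its leading Unicode byte order mark.

    Heuristic only: UTF-32 is checked first because its little-endian
    BOM (FF FE 00 00) begins with the UTF-16 little-endian BOM (FF FE).
    """
    if data.startswith(codecs.BOM_UTF32_LE):
        return "UTF-32 little-endian with BOM"
    if data.startswith(codecs.BOM_UTF32_BE):
        return "UTF-32 big-endian with BOM"
    if data.startswith(codecs.BOM_UTF16_LE):  # FF FE
        return "UTF-16 little-endian with BOM"
    if data.startswith(codecs.BOM_UTF16_BE):  # FE FF
        return "UTF-16 big-endian with BOM"
    return "no UTF-16/32 BOM"


# Hand-built stand-in for the layout the answer reports:
# a byte order mark followed by little-endian UTF-16 code units.
sample = codecs.BOM_UTF16_LE + "Aé".encode("utf-16-le")

print(sample.hex(" "))       # two bytes per code unit after the 2-byte BOM
print(describe_bom(sample))
```

Each character in the sample occupies two bytes after the mark, which is what separates a UTF-16 dump from a UTF-32 one (four bytes per character) when answering the "16 or 32?" part of the question.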