Is it UTF-16? UTF-32? Something else?
I want to look into this for performance reasons, since I'm converting a lot of strings from UTF-8 to "native" NSString, and the performance penalty seems to land on __CFFromUTF8, a built-in conversion function. By the way, I'm just guessing that NSUnicodeStringEncoding is what is used internally, since NSString's fastestEncoding returns that value for international strings (for plain ASCII strings, MacRoman is returned).
Testing with dataUsingEncoding: indicates that NSUnicodeStringEncoding is little-endian UTF-16 preceded by a byte order mark (on both the simulator and a real device), and Apple's String Programming Guide for Cocoa says "NSString objects are conceptually UTF-16 with platform endianness", so I'd say it's reasonable to assume UTF-16 is used internally.
(The same guide goes on to say "That doesn't necessarily imply anything about their internal storage mechanism", so Apple is fully reserving the right to change this in the future.)
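If you want to sanity-check the byte layout outside of Cocoa, you can construct the same bytes by hand. This Python sketch (the helper name is mine, not an Apple API) builds what dataUsingEncoding: returned in my test for NSUnicodeStringEncoding on a little-endian device: a UTF-16LE byte order mark followed by UTF-16LE code units.

```python
import codecs

def ns_unicode_encoding_bytes(s: str) -> bytes:
    # Hypothetical helper mimicking the observed output of
    # [s dataUsingEncoding:NSUnicodeStringEncoding] on a
    # little-endian device: BOM (FF FE) + UTF-16LE code units.
    return codecs.BOM_UTF16_LE + s.encode("utf-16-le")

data = ns_unicode_encoding_bytes("Aé")
print(data.hex())  # fffe4100e900 -> BOM, 'A' (0041), 'é' (00E9)
```

Comparing this output byte-for-byte against the NSData from dataUsingEncoding: is how you can confirm the little-endian-with-BOM observation yourself.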