I am working on an SMS application for the iPhone. I need to detect whether the user has entered any Unicode (non-ASCII) characters in the NSString they wish to send.
I need to do this because Unicode characters take up more space in the message, and also because I need to convert them into their hexadecimal equivalents.
So my question is: how do I detect the presence of a Unicode character in an NSString (which I read from a UITextView)? Also, how do I then convert those characters into their UCS-2 hexadecimal equivalents?
E.g. 繁 = 7E41, 体 = 4F53, 中 = 4E2D, 文 = 6587
To check whether the string contains only ASCII characters (or fits another encoding of your choice), use:
[myString canBeConvertedToEncoding:NSASCIIStringEncoding];
It will return NO if the string contains non-ASCII characters. You can then convert the string to UCS-2 data with:
[myString dataUsingEncoding:NSUTF16BigEndianStringEncoding];
or NSUTF16LittleEndianStringEncoding, depending on the byte order you need. UTF-16 has superseded UCS-2. The two are identical for characters in the Basic Multilingual Plane, but UTF-16 encodes characters outside it as surrogate pairs. You can read about the differences here:
http://en.wikipedia.org/wiki/UTF-16/UCS-2
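To tie the two calls above to the hexadecimal form asked for in the question, here is a minimal sketch. HexUTF16BEString is a hypothetical helper name (not a Foundation API), and the sketch assumes the message only needs its UTF-16 big-endian code units printed as uppercase hex digits.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of Foundation): returns the UTF-16
// big-endian bytes of `string` as a run of uppercase hex digits.
static NSString *HexUTF16BEString(NSString *string) {
    NSData *data = [string dataUsingEncoding:NSUTF16BigEndianStringEncoding];
    const uint8_t *bytes = (const uint8_t *)data.bytes;
    NSMutableString *hex = [NSMutableString stringWithCapacity:data.length * 2];
    for (NSUInteger i = 0; i < data.length; i++) {
        [hex appendFormat:@"%02X", bytes[i]];
    }
    return hex;
}

// Hypothetical helper: YES when the string needs more than plain ASCII.
static BOOL ContainsNonASCII(NSString *string) {
    return ![string canBeConvertedToEncoding:NSASCIIStringEncoding];
}
```

A caller could check ContainsNonASCII(message) first and only call HexUTF16BEString(message) when it returns YES. For the four example characters in the question, each character should map to the four hex digits listed there. Characters outside the Basic Multilingual Plane would come out as two code units (a surrogate pair), which plain UCS-2 cannot represent.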
I couldn't get this to work.
I had an HTML string containing non-breaking spaces (&nbsp;):
</div>Great Guildford St/SouthwarkSt &nbsp;Stop:&nbsp; BM<br>Walk to SE1 0HL<br>
which became this string:
"Great Guildford St/SouthwarkSt \U00a0Stop:\U00a0 BM",
I tried three kinds of encode/decode round trip:
// Attempt 1: UTF-16 big endian
// NSData *asciiData = [instruction dataUsingEncoding:NSUTF16BigEndianStringEncoding];
// NSString *asciiString = [[NSString alloc] initWithData:asciiData
//                                               encoding:NSUTF16BigEndianStringEncoding];
// Attempt 2: ASCII
// NSData *asciiData = [instruction dataUsingEncoding:NSASCIIStringEncoding];
// NSString *asciiString = [[NSString alloc] initWithData:asciiData
//                                               encoding:NSASCIIStringEncoding];
// Attempt 3: UTF-16 little endian
NSData *asciiData = [instruction dataUsingEncoding:NSUTF16LittleEndianStringEncoding];
NSString *asciiString = [[NSString alloc] initWithData:asciiData
                                              encoding:NSUTF16LittleEndianStringEncoding];
None of these worked.
They seemed to work, because when I NSLog the string it looks OK:
NSLog(@"HAS UNICODE :%@", instruction);
// ... do encode/decode ...
NSLog(@"UNICODE AFTER:%@", asciiString);
which outputs:
HAS UNICODE: St/SouthwarkSt Stop: BM
UNICODE AFTER: St/SouthwarkSt Stop: BM
But I happened to store these strings in an NSArray and call [stringArray description],
and all the Unicode was still in there:
instructionsArrayString: (
"Great Guildford St/SouthwarkSt \U00a0Stop:\U00a0 BM",
"Walk to SE1 0HL"
)
So NSLog hides the non-breaking spaces when it prints the string directly,
but they show up in the NSArray description, so you may think you've removed the Unicode when you haven't.
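One way to see what a string really contains, regardless of how NSLog renders it, is to escape every non-ASCII code unit before logging. The sketch below is only an illustration of that idea. EscapeNonASCII is a hypothetical helper name, not a Foundation API. Note also that a round trip through a UTF-16 encoding preserves every character, so it would not be expected to strip the non-breaking spaces in the first place.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: copies `string`, replacing every UTF-16 code unit
// outside the 7-bit ASCII range with a \uXXXX escape so it is visible in logs.
static NSString *EscapeNonASCII(NSString *string) {
    NSMutableString *escaped = [NSMutableString stringWithCapacity:string.length];
    for (NSUInteger i = 0; i < string.length; i++) {
        unichar c = [string characterAtIndex:i];
        if (c < 0x80) {
            [escaped appendFormat:@"%C", c];       // plain ASCII, keep as-is
        } else {
            [escaped appendFormat:@"\\u%04x", c];  // make the code unit visible
        }
    }
    return escaped;
}
```

Logging EscapeNonASCII(asciiString) instead of asciiString should make a leftover U+00A0 show up as \u00a0 rather than as something that looks like an ordinary space.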
I will try another method that replaces the characters.
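A replacement approach could look roughly like the following sketch. It assumes the only unwanted character is the non-breaking space (U+00A0) and that an ordinary space is an acceptable substitute. ReplaceNonBreakingSpaces is a hypothetical helper name.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: swaps each non-breaking space (U+00A0)
// for an ordinary ASCII space and leaves everything else alone.
static NSString *ReplaceNonBreakingSpaces(NSString *string) {
    // Build the one-character search string from its code point.
    NSString *nbsp = [NSString stringWithFormat:@"%C", (unichar)0x00A0];
    return [string stringByReplacingOccurrencesOfString:nbsp withString:@" "];
}
```

Afterwards, [result canBeConvertedToEncoding:NSASCIIStringEncoding] can be used to confirm that no other non-ASCII characters remain.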