What is the Default Charset/Encoding of text messa

Posted 2019-02-25 01:32

Question:

To keep it simple, I am primarily concerned with English-language handsets in North America.

Specifically, when sending or receiving SMS and MMS messages, how are the characters encoded? Is there a difference between the two?

My initial research suggested that UTF-8 was the default, but I have also seen references to US-ASCII for US devices and other charsets for other locales.

Answer 1:

Quote:
The platform's default charset is UTF-8. (This is in contrast to some older implementations, where the default charset depended on the user's locale.)

More information can be found here: Charset | Android Developers



Answer 2:

The short answer for the US is GSM 03.38, with UTF-16BE used instead when the message contains emoji or other text that GSM 03.38 cannot encode directly.
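As a rough illustration of that rule, here is a minimal Python sketch that checks whether a message fits the GSM 03.38 default alphabet and otherwise falls back to UTF-16BE. The function names (`pick_sms_encoding`, `utf16_code_units`) are hypothetical, and the character tables are my own transcription of the GSM 7-bit default alphabet and its basic extension table, so verify them against the specification before relying on them. National language shift tables are ignored.

```python
# Sketch only: the tables below are a hand transcription of the GSM 03.38
# default alphabet and basic extension table (an assumption to verify).
GSM_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞ"
    "ÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Characters reachable through the escape code; each costs two septets.
GSM_EXTENSION = "\f^{}\\[~]|€"
GSM_CHARS = frozenset(GSM_BASIC + GSM_EXTENSION)


def pick_sms_encoding(text: str) -> str:
    """Return 'GSM 03.38' if every character fits, else 'UTF-16BE'."""
    if all(ch in GSM_CHARS for ch in text):
        return "GSM 03.38"
    return "UTF-16BE"


def utf16_code_units(text: str) -> int:
    """Count 16-bit code units; characters outside the BMP take two."""
    return len(text.encode("utf-16-be")) // 2


if __name__ == "__main__":
    for message in ("Hello, world!", "Price: 5€", "Hi 😀"):
        print(repr(message), pick_sms_encoding(message), utf16_code_units(message))
```

A plain English message stays in GSM 03.38, while a single emoji forces the whole message into the 16-bit encoding, where that emoji occupies two code units (a surrogate pair).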

When sending or receiving SMS, the encoding is definitely not UTF-8, since neither the PDU format nor the SMPP protocol supports it. Consult the SMPP specification for the list of supported encodings. Of those, the only Unicode-compatible option is UCS-2BE. My observation is that most phones (including all Android devices and iPhones) simply treat this as UTF-16BE, because that allows for the complete Unicode character set (including things like emoji