Convert to UCS2

Posted 2019-01-11 19:42

Question:

Is there any function in VB.NET (or C#) that encodes a string in UCS-2?

Thanks

Answer 1:

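Use the following functions to encode a Unicode string as a "UCS2" hex string and to decode it again. Note that Encoding.BigEndianUnicode is big-endian UTF-16, so the output is valid UCS-2 only when the string contains no surrogate pairs: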

    using System;
    using System.Text;

    public static class SmsEngine
    {
        // Encodes a string as big-endian UTF-16 and returns the bytes as hex digits.
        // For text without surrogate pairs this is the UCS2 form used in GSM messages.
        public static string UnicodeStr2HexStr(string strMessage)
        {
            byte[] ba = Encoding.BigEndianUnicode.GetBytes(strMessage);
            return BitConverter.ToString(ba).Replace("-", "");
        }

        // Decodes a hex string holding a GSM UCS2 message back into a string.
        public static string HexStr2UnicodeStr(string strHex)
        {
            byte[] ba = HexStr2HexBytes(strHex);
            return HexBytes2UnicodeStr(ba);
        }

        // Decodes big-endian UCS2 bytes into a string.
        public static string HexBytes2UnicodeStr(byte[] ba)
        {
            return Encoding.BigEndianUnicode.GetString(ba, 0, ba.Length);
        }

        // Converts a string of hex digits (spaces allowed) into the bytes it represents.
        public static byte[] HexStr2HexBytes(string strHex)
        {
            strHex = strHex.Replace(" ", "");
            // An odd digit count cannot form whole bytes, so reject it instead of dropping a digit.
            if (strHex.Length % 2 != 0)
                throw new ArgumentException("Hex string must contain an even number of digits.", nameof(strHex));

            byte[] aBytes = new byte[strHex.Length / 2];
            for (int i = 0; i < aBytes.Length; i++)
                aBytes[i] = Convert.ToByte(strHex.Substring(i * 2, 2), 16);
            return aBytes;
        }
    }

For example:

    String strE = SmsEngine.UnicodeStr2HexStr("سلام به گچپژ پارسي");
    // strE = "0633064406270645002006280647002006AF0686067E06980020067E062706310633064A"
    String strD = SmsEngine.HexStr2UnicodeStr("0633064406270645002006280647002006AF0686067E06980020067E062706310633064A");
    // strD = "سلام به گچپژ پارسي"


Answer 2:

No. .NET strings support the full Unicode range, and many encodings derive from System.Text.Encoding. You can trivially get UTF-16, but not UCS-2. However, if you first get rid of all surrogate pairs in the input string, the UTF-16 output is identical to UCS-2. There is no built-in encoding that does that for you.
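The surrogate check described above could be sketched roughly as follows. This is only an illustration: the class name Ucs2Sketch and the method EncodeUcs2BigEndian are hypothetical, and throwing on a surrogate is just one possible policy (replacing the character with a placeholder would be another). Big-endian byte order is assumed here.

```csharp
using System;
using System.Text;

public static class Ucs2Sketch
{
    // Hypothetical helper: returns big-endian UCS-2 bytes, or throws if the
    // string contains anything that UCS-2 cannot represent.
    public static byte[] EncodeUcs2BigEndian(string text)
    {
        foreach (char c in text)
        {
            // A surrogate code unit means the character lies outside the
            // Basic Multilingual Plane (or the string is malformed).
            if (char.IsSurrogate(c))
                throw new ArgumentException("Input is not representable in UCS-2.", nameof(text));
        }

        // With no surrogates present, UTF-16 output is byte-for-byte UCS-2.
        return Encoding.BigEndianUnicode.GetBytes(text);
    }
}
```

Under these assumptions, Ucs2Sketch.EncodeUcs2BigEndian("Hi") yields the bytes 00 48 00 69, while a string containing a character such as U+1F600 is rejected.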



Answer 3:

See Encoding.Unicode.

Given a .NET String, call Encoding.Unicode.GetBytes to get a byte array representing that string encoded in UCS2 (little-endian; Encoding.BigEndianUnicode gives big-endian byte order).

Edit: In the context of System.Text.Encoding, Unicode = UTF-16. As Johannes points out, UTF-16 and UCS-2 are not the same thing in the presence of surrogates.
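To make the byte-order and surrogate points concrete, here is a small sketch; the class name ByteOrderDemo and its Hex helper are made up for illustration. It compares Encoding.Unicode (little-endian UTF-16) with Encoding.BigEndianUnicode on a single character.

```csharp
using System;
using System.Text;

public static class ByteOrderDemo
{
    // Hypothetical helper: encodes s with the given encoding and formats
    // the resulting bytes as dash-separated hex.
    public static string Hex(Encoding encoding, string s)
    {
        return BitConverter.ToString(encoding.GetBytes(s));
    }
}
```

For U+0633 (ARABIC LETTER SEEN), ByteOrderDemo.Hex(Encoding.Unicode, "\u0633") gives "33-06" and ByteOrderDemo.Hex(Encoding.BigEndianUnicode, "\u0633") gives "06-33". A character outside the BMP such as "\U0001F600" encodes to four bytes in either case, which is exactly where UTF-16 stops being UCS-2.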