Question:
Let's say I have a random Chinese character, 玩. I want to convert it to Unicode, which would be U+73A9. How can I do this in C#?
Answer 1:
Take myChar to be a char holding your character:
    Console.WriteLine("{0} U+{1:x4} {2}", myChar, (int)myChar, (int)myChar);
This outputs the character itself, then its Unicode code point in hexadecimal, and then the same value as a decimal integer.
To output only the "U+..." code, reduce the format string and parameters:
    Console.WriteLine("U+{0:x4}", (int)myChar);
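One caveat with the cast: a C# char is a single UTF-16 code unit, so (int)myChar gives the code point only for characters in the Basic Multilingual Plane (玩 is one of them). Characters above U+FFFF are stored as a surrogate pair of two chars. A minimal sketch for that case, assuming the text is held in a string; the class and method names here are made up for illustration:

```csharp
using System;
using System.Collections.Generic;

public static class CodePointFormatter
{
    // Hypothetical helper: returns one "U+XXXX" string per code point in text.
    public static List<string> ToCodePoints(string text)
    {
        var result = new List<string>();
        for (int i = 0; i < text.Length; i++)
        {
            // ConvertToUtf32 reads the char at index i and, if it starts a
            // surrogate pair, combines it with the next char into one code point.
            // It throws ArgumentException on an unpaired surrogate.
            int codePoint = char.ConvertToUtf32(text, i);
            result.Add("U+" + codePoint.ToString("X4"));

            // Skip the low surrogate that was just consumed.
            if (char.IsHighSurrogate(text[i]))
            {
                i++;
            }
        }
        return result;
    }
}
```

Calling CodePointFormatter.ToCodePoints("玩") should give a single entry, "U+73A9". The "X4" format pads to at least four uppercase hex digits, which matches the usual U+ notation.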
Answer 2:
The character 玩 is already in Unicode.
If you have it in C# as 玩, then it's currently in UTF-16, which is one of the Unicode encoding forms.
If you are obtaining it from somewhere else you need to:
1. Find the encoding it is in.
2. Get the bytes (wrapped by a stream is nice).
3. Get or write an appropriate decoder (in .NET, an Encoding) for that encoding.
4. Use the decoder to get the string (wrapping the nice stream with a TextReader is nicer).
Step 3 may be simple (oh, I just use that one!) or hard (darn, have to write it myself!) or somewhere in between (hey, has anyone written one of these already?!).
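The steps above can be sketched roughly as follows. This assumes the simple case, where .NET already ships an Encoding for the source format; the class and method names are made up for illustration:

```csharp
using System.IO;
using System.Text;

public static class ByteDecoding
{
    // Hypothetical helper: decodes raw bytes into a string using the
    // encoding the caller has identified for them.
    public static string ReadText(byte[] bytes, Encoding encoding)
    {
        // Wrap the bytes in a stream, then wrap the stream in a
        // StreamReader (a TextReader) that decodes with the given encoding.
        using (var stream = new MemoryStream(bytes))
        using (var reader = new StreamReader(stream, encoding))
        {
            return reader.ReadToEnd();
        }
    }
}
```

For example, 玩 encoded as UTF-8 is the three bytes E7 8E A9, so ByteDecoding.ReadText(new byte[] { 0xE7, 0x8E, 0xA9 }, Encoding.UTF8) should return "玩". Once it is a C# string it is UTF-16 in memory, and the code point can be read from it as in the first answer.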
Answer 3:
A slightly longer example that follows the pattern in Jon Hanna's answer:
    using System;
    using System.Text;

    namespace UnicodeDecodeConsoleApplication
    {
        class Program
        {
            static void Main(string[] args)
            {
                char c = '\u73a9'; // 玩
                char[] chars = { c };

                // Big-endian UTF-16 puts the high byte first, so for a BMP
                // character the hex bytes read in the same order as the code point.
                Encoding encoding = Encoding.BigEndianUnicode;
                byte[] encodedBytes = encoding.GetBytes(chars);

                StringBuilder stringBuilder = new StringBuilder("U+");
                foreach (byte b in encodedBytes)
                {
                    stringBuilder.Append(b.ToString("x2"));
                }

                Console.WriteLine(stringBuilder); // U+73a9
                Console.ReadLine(); // keep the console window open
            }
        }
    }
--jeroen