Decode UTF8 String to Latin

Posted 2019-04-15 11:06

I am trying to transform a UTF8 string to Latin characters.

Here's an example of how I am trying to achieve this:

string sUnicode = "Peneda-GerÃªs";
string result = Encoding.Unicode.GetString(Encoding.Convert(Encoding.UTF8, Encoding.Unicode, Encoding.UTF8.GetBytes(sUnicode)));

MessageBox.Show(result);

The returned string is the same as the input; nothing changes.

What am I missing?

If I paste the same text into this site:

http://software.hixie.ch/utilities/cgi/unicode-decoder/utf8-decoder

it gets decoded correctly to "Peneda-Gerês".

Tags: .net encoding
2 answers
等我变得足够好
Reply #2 · 2019-04-15 11:42

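You are converting from Unicode (UTF-16) to UTF-8 and back to Unicode, so the result is the same as the source. A .NET string is already UTF-16, which makes the whole chain a lossless round trip. Start from the raw bytes instead: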

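    // Raw UTF-8 bytes: 'Ã' (0xC3) followed by 'ª' (0xAA) is the UTF-8 encoding of 'ê'.
    byte[] byteAr = {
        (byte) 'P', (byte) 'e', (byte) 'n', (byte) 'e', (byte) 'd', (byte) 'a', (byte) '-',
        (byte) 'G', (byte) 'e', (byte) 'r', (byte) 'Ã', (byte) 'ª', (byte) 's'
    };

    // Interpret the bytes as UTF-8; the result is "Peneda-Gerês".
    var result = Encoding.Unicode.GetString(Encoding.Convert(Encoding.UTF8, Encoding.Unicode, byteAr));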
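To see why the code in the question changes nothing, here is a minimal sketch of that same conversion chain with an equality check at the end. The class and method names (`RoundTripDemo`, `RoundTrip`) are made up for illustration, and the input is written with `\u` escapes so the source file's own encoding cannot interfere.

```csharp
using System;
using System.Text;

public static class RoundTripDemo
{
    // Same chain as in the question: string -> UTF-8 bytes -> UTF-16 bytes -> string.
    public static string RoundTrip(string input)
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(input);
        byte[] utf16 = Encoding.Convert(Encoding.UTF8, Encoding.Unicode, utf8);
        return Encoding.Unicode.GetString(utf16);
    }

    public static void Main()
    {
        // "Peneda-GerÃªs": U+00C3 and U+00AA are the two characters of the garbled 'ê'.
        string input = "Peneda-Ger\u00C3\u00AAs";

        // Every step is lossless, so the output equals the input.
        Console.WriteLine(RoundTrip(input) == input); // prints True
    }
}
```

Since each step only re-encodes the same characters, no amount of converting between Unicode encodings can repair the text; the bytes have to be reinterpreted, as in the byte array above.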
Juvenile、少年°
Reply #3 · 2019-04-15 11:49

Your source string is in ISO-8859-1: it holds UTF-8 bytes that were decoded as ISO-8859-1.

Run this and pick the encoding whose output looks correct:

string sUnicode = "Peneda-GerÃªs";
foreach (var enc in Encoding.GetEncodings())
{
    // Re-encode the string with each available encoding, then read those bytes as UTF-8.
    Console.WriteLine("{0} {1}",
        Encoding.UTF8.GetString(enc.GetEncoding().GetBytes(sUnicode)),
        enc.Name);
}

Or to be spot on:

string result = Encoding.UTF8.GetString(
     Encoding.GetEncoding("ISO-8859-1").GetBytes(sUnicode));
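The one-liner above silently produces replacement characters when the input was not garbled in the first place. As a hedged sketch, the same re-encode-then-decode idea can be wrapped in a helper that uses exception fallbacks and returns the input unchanged when the repair does not apply. `MojibakeRepair` and `TryRepair` are hypothetical names, not part of .NET, and the helper assumes the text was misread specifically as ISO-8859-1.

```csharp
using System;
using System.Text;

public static class MojibakeRepair
{
    // Hypothetical helper: undoes "UTF-8 bytes read as ISO-8859-1".
    // Returns the input unchanged if it cannot be that kind of garbled text.
    public static string TryRepair(string text)
    {
        // Throw instead of substituting '?' for characters outside ISO-8859-1.
        Encoding latin1 = Encoding.GetEncoding(
            "ISO-8859-1",
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);

        // Throw instead of emitting U+FFFD for invalid UTF-8 sequences.
        Encoding strictUtf8 = new UTF8Encoding(false, true);

        try
        {
            return strictUtf8.GetString(latin1.GetBytes(text));
        }
        catch (EncoderFallbackException)
        {
            return text; // contains a character that ISO-8859-1 cannot represent
        }
        catch (DecoderFallbackException)
        {
            return text; // the bytes are not valid UTF-8, so the text was not garbled this way
        }
    }

    public static void Main()
    {
        // Garbled input is repaired to "Peneda-Gerês".
        Console.WriteLine(TryRepair("Peneda-Ger\u00C3\u00AAs") == "Peneda-Ger\u00EAs"); // prints True

        // Already-correct input fails strict UTF-8 decoding and is returned as-is.
        Console.WriteLine(TryRepair("Peneda-Ger\u00EAs") == "Peneda-Ger\u00EAs"); // prints True
    }
}
```

Plain ASCII text passes through unchanged either way, because ASCII bytes are identical in ISO-8859-1 and UTF-8.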