WebClient returns text in an unexpected encoding

Posted 2019-05-06 18:30

I am trying to use WebClient to translate the word 'Banana' into Russian:

private void button1_Click(object sender, EventArgs e)
{
    Navigate("http://translate.google.ru/translate_a/t?client=x&text=Banana&hl=en&sl=en&tl=ru");
}

private void Navigate(String address)
{
    WebClient client = new WebClient();
    client.Proxy = WebRequest.DefaultWebProxy;
    client.Credentials = new NetworkCredential("user", "password", "domain");
    client.Proxy.Credentials = new NetworkCredential("user", "password", "domain");
    string _stranslate = client.DownloadString(new Uri(address));
}

I expect _stranslate to contain:

{"sentences":[{"trans":"Банан","orig":"Banana@","translit":"Banan @","src_translit":""}],"src":"en","server_time":0}

but I get this instead:

{"sentences":[{"trans":"вБОБО","orig":"Banana@","translit":"Banan @","src_translit":""}],"src":"en","server_time":0}

Can someone help me?

3 Answers
太酷不给撩
Reply #2 · 2019-05-06 18:48

Add this before your client.DownloadString():

client.Encoding = System.Text.Encoding.UTF8;

Your encoding is likely getting messed up when you read the string.

Putting your URL into an HTTP header viewer, I see the following in the response headers:

Content-Type: text/javascript; charset=UTF-8
Content-Language: ru

Basically, you need to find out what encoding they are sending back and set your encoding to match.

It is very important to set the encoding before you call DownloadString().

我想做一个坏孩纸
Reply #3 · 2019-05-06 19:02

IMHO a better solution: add the URI query parameter oe=UTF-8 and use UTF-8 everywhere:

var nameValueCollection = new NameValueCollection
{
    {"client", "x"},
    {"text", HttpUtility.UrlEncode(text)},
    {"hl", "en"},
    {"sl", fromLanguage},
    {"tl", toLanguage},
    {"ie", "UTF-8"},
    {"oe", "UTF-8"}
};

string htmlResult;
using (var wc = new WebClient())
{
    wc.Encoding = Encoding.UTF8;
    wc.QueryString = nameValueCollection;
    htmlResult = wc.DownloadString(GoogleAddress);
}
迷人小祖宗
Reply #4 · 2019-05-06 19:07

Try checking the response headers; the Content-Type header tells you which encoding to use.

Content-Type => text/javascript; charset=KOI8-R

So just add this line.

client.Encoding = Encoding.GetEncoding(20866);

or

client.Encoding = Encoding.GetEncoding("KOI8-R");

A complete list of encodings can be found in the documentation for the Encoding class.
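To see why KOI8-R fits the symptom, here is a minimal offline sketch that reproduces the garbled string from the question. It assumes the asker's machine decoded the response as Windows-1251 (the question does not say which default was in effect); MojibakeDemo and Misdecode are hypothetical names used only for illustration.

```csharp
using System;
using System.Text;

public static class MojibakeDemo
{
    // Hypothetical helper: encodes text with one charset and decodes the
    // resulting bytes with another, imitating a client that guesses wrong.
    public static string Misdecode(string text, string actualCharset, string assumedCharset)
    {
        // Needed on .NET Core / .NET 5+, where legacy code pages such as
        // KOI8-R are not registered by default. On .NET Framework the code
        // pages are built in and this line should be removed.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Bytes as the server would send them.
        byte[] wire = Encoding.GetEncoding(actualCharset).GetBytes(text);

        // Text as a client with the wrong encoding would read them.
        return Encoding.GetEncoding(assumedCharset).GetString(wire);
    }
}
```

Calling MojibakeDemo.Misdecode("Банан", "KOI8-R", "windows-1251") yields "вБОБО", the same garbled value shown in the question, which suggests the bytes on the wire were KOI8-R all along.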

Another way is to use System.Net.Mime.ContentType to read the charset from the response headers, like this:

byte[] data = client.DownloadData(url);
ContentType contentType = new System.Net.Mime.ContentType(client.ResponseHeaders[HttpResponseHeader.ContentType]);
string _stranslate = Encoding.GetEncoding(contentType.CharSet).GetString(data);
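The snippet above throws if the server omits the Content-Type header or the charset parameter. A slightly more defensive sketch of the same idea follows; ResponseDecoder and Decode are hypothetical names, and the UTF-8 fallback is an assumption rather than anything the service guarantees.

```csharp
using System;
using System.Net.Mime;
using System.Text;

public static class ResponseDecoder
{
    // Hypothetical helper: decodes a response body using the charset named
    // in the Content-Type header, falling back to UTF-8 (an assumption)
    // when the header or the charset is missing or unusable.
    public static string Decode(byte[] body, string contentTypeHeader)
    {
        Encoding encoding = Encoding.UTF8;

        if (!string.IsNullOrWhiteSpace(contentTypeHeader))
        {
            try
            {
                string charset = new ContentType(contentTypeHeader).CharSet;
                if (!string.IsNullOrEmpty(charset))
                {
                    // On .NET Core / .NET 5+, legacy names such as KOI8-R are
                    // only known after registering CodePagesEncodingProvider;
                    // otherwise this throws and the fallback is used.
                    encoding = Encoding.GetEncoding(charset);
                }
            }
            catch (FormatException)
            {
                // Malformed Content-Type value: keep the fallback.
            }
            catch (ArgumentException)
            {
                // Unknown charset name: keep the fallback.
            }
        }

        return encoding.GetString(body);
    }
}
```

With the variables from the snippet above, the call would be ResponseDecoder.Decode(data, client.ResponseHeaders[HttpResponseHeader.ContentType]). Silently falling back keeps the call from throwing, but it can hide a wrong guess, so logging the fallback may be worthwhile.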