How to convert Hebrew (Unicode) to ASCII in C#?

Posted 2019-05-01 10:13

I have to create a text file that contains numbers and Hebrew letters encoded as ASCII.

This is the file-creation method, which is triggered by a button click:

protected void ToFile(object sender, EventArgs e)
{
    filename = Transactions.generateDateYMDHMS();
    string path = string.Format("{0}{1}.001", Server.MapPath("~/transactions/"), filename);
    StreamWriter sw = new StreamWriter(path, false, Encoding.ASCII);
    sw.WriteLine("hello");
    sw.WriteLine(Transactions.convertUTF8ASCII("שלום"));
    sw.WriteLine("bye");
    sw.Close();
}

As you can see, I use the static method Transactions.convertUTF8ASCII() to convert a .NET string (presumably Unicode) to its ASCII representation. When I apply it to the Hebrew word 'shalom', I get back '????' instead of the result I need.

Here is the method.

public static string convertUTF8ASCII(string initialString)
{
    byte[] unicodeBytes = Encoding.Unicode.GetBytes(initialString);
    byte[] asciiBytes = Encoding.Convert(Encoding.Unicode, Encoding.ASCII, unicodeBytes);
    return Encoding.ASCII.GetString(asciiBytes);
}

Instead of the original word encoded as ASCII, I get '????' in the file I create. I get the same result when I step through the code in the debugger.

What am I doing wrong?

4 answers
萌系小妹纸
#2 · 2019-05-01 10:20

I just faced the same issue when the original XML file was in ASCII encoding.

As Userx suggested, use:

Encoding.GetEncoding(1255)

XDocument.Parse(System.IO.File.ReadAllText(xmlPath, Encoding.GetEncoding(1255)));

Now my XDocument can read Hebrew even though the XML file was saved as "ASCII" (presumably an ANSI code page such as Windows-1255, since ASCII itself has no Hebrew letters).
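
To make the idea concrete, here is a minimal round-trip sketch. The temp file name, the `<greeting>` element, and the helper names are made up for illustration. It also assumes a runtime where `CodePagesEncodingProvider` is available (.NET Core / .NET 5+, or the System.Text.Encoding.CodePages package); on .NET Framework code page 1255 should be available without that registration call.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Linq;

public static class HebrewXmlDemo
{
    // Returns the Windows-1255 (Hebrew) code page encoding.
    public static Encoding Hebrew()
    {
        // Needed on .NET Core / .NET 5+; can be dropped on .NET Framework.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(1255);
    }

    // Reads the file as Windows-1255 text and parses it as XML.
    public static XDocument LoadWindows1255Xml(string xmlPath)
    {
        return XDocument.Parse(File.ReadAllText(xmlPath, Hebrew()));
    }

    public static void Main()
    {
        // Hypothetical sample file, written as Windows-1255 bytes.
        string xmlPath = Path.Combine(Path.GetTempPath(), "greeting-1255.xml");
        File.WriteAllText(xmlPath, "<greeting>שלום</greeting>", Hebrew());

        XDocument doc = LoadWindows1255Xml(xmlPath);
        Console.WriteLine(doc.Root.Value == "שלום"); // prints True
    }
}
```

The important part is that the encoding passed to `File.ReadAllText` matches the bytes actually on disk; if the file were really UTF-8, passing 1255 would produce garbage instead.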

三岁会撩人
#3 · 2019-05-01 10:23

Do you perhaps mean ANSI, not ASCII?

ASCII doesn't define any Hebrew characters. There are, however, some ANSI code pages that do, such as "windows-1255".

In that case, you may want to look at the System.Text.Encoding documentation: http://msdn.microsoft.com/en-us/library/system.text.encoding.aspx

In short, where you have:

Encoding.ASCII

You would replace it with:

Encoding.GetEncoding(1255)
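
As a rough sketch of what that swap looks like for the file-writing code in the question: the class name, helper names, and temp file path below are my own, not from the question, and the `RegisterProvider` call is an assumption that only matters on .NET Core / .NET 5+ (on .NET Framework it should not be needed).

```csharp
using System;
using System.IO;
using System.Text;

public static class Windows1255Demo
{
    // Returns the Windows-1255 (Hebrew) code page encoding.
    public static Encoding GetHebrewEncoding()
    {
        // Needed on .NET Core / .NET 5+; can be dropped on .NET Framework.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(1255);
    }

    // Writes the given lines to a file using Windows-1255 instead of ASCII.
    public static void WriteLines(string path, params string[] lines)
    {
        using (var sw = new StreamWriter(path, false, GetHebrewEncoding()))
        {
            foreach (string line in lines)
            {
                sw.WriteLine(line);
            }
        }
    }

    public static void Main()
    {
        // Hypothetical output path, used only for this demo.
        string path = Path.Combine(Path.GetTempPath(), "shalom-1255.txt");
        WriteLines(path, "hello", "שלום", "bye");

        // Each Hebrew letter becomes a single byte in the 0xE0-0xFA range.
        Console.WriteLine(BitConverter.ToString(GetHebrewEncoding().GetBytes("שלום")));
        // prints F9-EC-E5-ED
    }
}
```

Whoever reads the file has to open it as Windows-1255 too; opened as ASCII or UTF-8, those bytes will not show up as Hebrew.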
爷、活的狠高调
#4 · 2019-05-01 10:34

If you really do mean ASCII, are you perhaps asking about transliteration ("Romanization") rather than encoding conversion?
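
If transliteration is what you want, it would be a lookup from Hebrew letters to Latin strings rather than anything in System.Text.Encoding. The sketch below is purely illustrative: the four-letter table is made up for this one word, it is not a standard Romanization scheme, and a real one would need the whole alphabet plus rules for vowels.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class HebrewTransliterationSketch
{
    // Hypothetical, deliberately incomplete mapping chosen for illustration only.
    private static readonly Dictionary<char, string> Map = new Dictionary<char, string>
    {
        { '\u05E9', "sh" }, // shin
        { '\u05DC', "l" },  // lamed
        { '\u05D5', "o" },  // vav (could equally be "v" or "u")
        { '\u05DD', "m" }   // final mem
    };

    // Replaces mapped letters, keeps ASCII as-is, marks everything else with '?'.
    public static string Romanize(string text)
    {
        var sb = new StringBuilder();
        foreach (char c in text)
        {
            string latin;
            if (Map.TryGetValue(c, out latin))
            {
                sb.Append(latin);
            }
            else if (c <= 127)
            {
                sb.Append(c);
            }
            else
            {
                sb.Append('?');
            }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Romanize("\u05E9\u05DC\u05D5\u05DD")); // prints shlom
    }
}
```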

唯我独甜
#5 · 2019-05-01 10:43

You can't simply translate arbitrary Unicode characters to ASCII. The best the encoder can do is replace each unsupported character with '?', hence '????'. The basic 7-bit characters will work, but nothing else will. I'm curious: what is the expected result?
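
A small sketch of that behaviour, in case it helps. The class and method names are mine; the second method is an optional extra showing how you could detect the data loss up front by asking for an encoder that throws instead of substituting.

```csharp
using System;
using System.Text;

public static class AsciiFallbackDemo
{
    // Round-trips through ASCII the way the question's helper effectively does.
    public static string ToAsciiLossy(string text)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(text); // unsupported chars become 0x3F ('?')
        return Encoding.ASCII.GetString(bytes);
    }

    // Returns true only when every character can be represented in ASCII.
    public static bool IsAsciiSafe(string text)
    {
        Encoding strict = Encoding.GetEncoding(
            "us-ascii",
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);
        try
        {
            strict.GetBytes(text);
            return true;
        }
        catch (EncoderFallbackException)
        {
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(ToAsciiLossy("שלום"));  // prints ????
        Console.WriteLine(ToAsciiLossy("hello")); // prints hello
        Console.WriteLine(IsAsciiSafe("שלום"));   // prints False
    }
}
```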

If you need this for transfer (rather than representation), you might consider Base64-encoding the underlying UTF-8 bytes.
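
Something along these lines, as a sketch (the class and method names are my own). The Base64 text contains only ASCII characters, so it survives an ASCII-only file or channel, and the receiver decodes it back to the original string.

```csharp
using System;
using System.Text;

public static class Base64TransferDemo
{
    // UTF-8 bytes of the text, rendered as ASCII-only Base64.
    public static string Encode(string text)
    {
        return Convert.ToBase64String(Encoding.UTF8.GetBytes(text));
    }

    // Reverses Encode: Base64 text back to the original string.
    public static string Decode(string base64)
    {
        return Encoding.UTF8.GetString(Convert.FromBase64String(base64));
    }

    public static void Main()
    {
        string encoded = Encode("שלום");
        Console.WriteLine(encoded);                   // prints 16nXnNeV150=
        Console.WriteLine(Decode(encoded) == "שלום"); // prints True
    }
}
```

The result is not human-readable, so this only makes sense when both ends agree on the scheme.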
