I am writing a Windows application and am having trouble converting extended ASCII characters [128-255] to their decimal equivalents.
When I receive an extended ASCII character, for example "Œ", from a jar file, it arrives in the C# application like this: �.
How can I convert it to its decimal equivalent, i.e. 140?
string textToConvert = "Œ";
Encoding iso8859 = Encoding.GetEncoding("iso-8859-1");
Encoding unicode = Encoding.Unicode;
byte[] srcTextBytes = iso8859.GetBytes(textToConvert);
byte[] destTextBytes = Encoding.Convert(iso8859,unicode, srcTextBytes);
char[] destChars = new char[unicode.GetCharCount(destTextBytes, 0, destTextBytes.Length)];
unicode.GetChars(destTextBytes, 0, destTextBytes.Length, destChars, 0);
System.String szchar = new System.String(destChars);
MessageBox.Show(szchar);
Please help me. How should I proceed?
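I think you are looking for something like this:
// Needs: using System; using System.Linq; using System.Text;
string str = "œ";
// Encode the character as a single Windows-1252 byte.
var bytes = Encoding.GetEncoding("Windows-1252").GetBytes(str);
// Round-trip that byte through its binary representation (only meaningful for a single byte).
string binStr = string.Join("", bytes.Select(b => Convert.ToString(b, 2)));
int decimalEquivalent = Convert.ToInt32(binStr, 2);
Console.WriteLine(decimalEquivalent); // 156 for "œ"; "Œ" gives 140
This works for a single character in the range [128-255].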
You have the wrong encoding. The iso-8859-1 encoding doesn't have printable characters at 128-159, as pointed out by Hans. According to this article, there are three encodings that contain the character you are looking for: iso-8859-15, Windows-1252, and one for Mac. Since this comes from a jar file, and as such should be OS independent, I would say the right encoding is iso-8859-15.
With the right encoding your call to GetBytes should return an array that contains the decimal values.
First, 140 in ISO-8859-1 is U+008C (ISO-8859-1 maps each byte value directly to the code point with the same number), and U+008C is a control character. ISO-8859-1 famously doesn't have Œ. (Famously, because there was controversy over French users being unable to use the ligature where they normally would, while Æ is included because in some of the languages the encoding was meant to support it is a separate letter, "ash", rather than a ligature as in French. Tempers were raised.)
string textToConvert = "Œ";
"Œ" is a string. It has nothing to do with "extended ASCII". It is stored as UTF-16 behind the scenes, but you shouldn't even think of it that way. Think of it as a string that has nothing to do with numbers, bytes or encodings until you start reading from and writing to streams (like files).
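To make that concrete, here is a minimal sketch (the class and method names are made up for illustration) showing that the number stored inside the string is the UTF-16 code unit for Œ, not the byte 140. The value 140 only appears once an encoding such as Windows-1252 is applied.

```csharp
using System;

public static class CodeUnitDemo
{
    // Returns the UTF-16 code unit of the first character in the string.
    public static int FirstCodeUnit(string text)
    {
        return text[0]; // char converts implicitly to int
    }

    public static void Main()
    {
        // "\u0152" is Œ; the escape avoids depending on the source file's encoding.
        Console.WriteLine(FirstCodeUnit("\u0152")); // 338 (U+0152), not 140
    }
}
```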
Encoding iso8859 = Encoding.GetEncoding("iso-8859-1");
As explained above, you certainly don't want this. You probably want GetEncoding("Windows-1252"), as that's a Windows encoding that matches 8859-1 except that it took out some of the controls and put in some more letters, including Œ at position 140. Let's assume you change it that way...
byte[] srcTextBytes = iso8859.GetBytes(textToConvert);
Okay, at this point, if you change to using CP-1252, you have a byte array containing a single byte with value 140 (0x8C).
byte[] destTextBytes = Encoding.Convert(iso8859,unicode, srcTextBytes);
char[] destChars = new char[unicode.GetCharCount(destTextBytes, 0, destTextBytes.Length)];
unicode.GetChars(destTextBytes, 0, destTextBytes.Length, destChars, 0);
System.String szchar = new System.String(destChars);
MessageBox.Show(szchar);
I have no idea what you're trying to do here. You started with a string and you are ending with a string, so what's going on?
Let's abandon this and start from scratch.
If you have a string and you want the bytes that represent it in CP-1252, then:
byte[] result = Encoding.GetEncoding("Windows-1252").GetBytes(inputString);
If you have some bytes in CP-1252 and you want the string they represent:
string result = System.Text.Encoding.GetEncoding("Windows-1252").GetString(inputBytes);
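As a rough, self-contained sketch of both directions (the `Cp1252Demo` class and its helper names are hypothetical): encoding Œ should yield the single byte 140, and decoding the byte 140 should yield Œ. The `RegisterProvider` call is an assumption about the runtime. On .NET Core and .NET 5+ the Windows code pages are not available until `CodePagesEncodingProvider` is registered; on the classic .NET Framework that line can be dropped.

```csharp
using System;
using System.Text;

public static class Cp1252Demo
{
    private static Encoding GetCp1252()
    {
        // Assumption: running on .NET Core / .NET 5+, where code page
        // encodings must be registered before they can be looked up.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding("Windows-1252");
    }

    // String -> CP-1252 bytes.
    public static byte[] ToBytes(string text)
    {
        return GetCp1252().GetBytes(text);
    }

    // CP-1252 bytes -> string.
    public static string FromBytes(byte[] bytes)
    {
        return GetCp1252().GetString(bytes);
    }

    public static void Main()
    {
        byte[] bytes = ToBytes("\u0152");             // Œ
        Console.WriteLine(bytes[0]);                  // 140

        string text = FromBytes(new byte[] { 140 });
        Console.WriteLine(text == "\u0152");          // True
    }
}
```

If the data arrives from the jar file as raw bytes, the byte value itself is already the decimal number being asked for; decoding is only needed to get the character back.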
If you want to read from or write to a stream (file, network stream, etc.) in Windows-1252, then use a StreamReader or StreamWriter created with that encoding:
using (TextReader reader = new StreamReader(source, Encoding.GetEncoding("Windows-1252"))) { /* read decoded text here */ }
using (TextWriter writer = new StreamWriter(sink, Encoding.GetEncoding("Windows-1252"))) { /* write text here */ }
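A small sketch of the stream case, using in-memory streams in place of a real file or socket. The `StreamDemo` class and its method names are invented for illustration, and the `RegisterProvider` line again assumes .NET Core or .NET 5+ (it is not needed on the classic .NET Framework).

```csharp
using System;
using System.IO;
using System.Text;

public static class StreamDemo
{
    private static Encoding GetCp1252()
    {
        // Assumption: .NET Core / .NET 5+, where code pages must be registered.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding("Windows-1252");
    }

    // Writes text to a stream as CP-1252 and returns the raw bytes produced.
    public static byte[] WriteAsCp1252(string text)
    {
        using (var sink = new MemoryStream())
        {
            using (TextWriter writer = new StreamWriter(sink, GetCp1252()))
            {
                writer.Write(text);
            } // disposing the writer flushes it
            return sink.ToArray(); // ToArray still works after the stream is closed
        }
    }

    // Reads all text from a stream whose bytes are CP-1252.
    public static string ReadAsCp1252(byte[] bytes)
    {
        using (var source = new MemoryStream(bytes))
        using (TextReader reader = new StreamReader(source, GetCp1252()))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        byte[] written = WriteAsCp1252("\u0152");   // Œ
        Console.WriteLine(written[0]);              // 140
        Console.WriteLine(ReadAsCp1252(written) == "\u0152"); // True
    }
}
```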