I'm having a problem converting text to and from UTF-8. I have a byte array:
byte[] c = new byte[] { 1, 2, 200 };
I convert it to a UTF-8 string and back to a byte array:
Encoding.UTF8.GetBytes(Encoding.UTF8.GetString(c));
As I understand it, I should get back an array of 3 bytes, right? But here's what I'm getting:
byte[5] { 1, 2, 239, 191, 189 }
What's the reason for this?
I understand that the sequence 239, 191, 189 is the UTF-8 encoding of the REPLACEMENT CHARACTER (U+FFFD) from the Unicode Specials block.
Also this is part of a bigger problem.
Not all sequences of bytes are valid UTF-8. Your array (1, 2, 200) is not valid UTF-8, which is why the decoder substitutes this special error character for the invalid byte.
MSDN says about Encoding.UTF8:
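1) There is no BOM (https://en.wikipedia.org/wiki/Byte_order_mark) in your example.
2) 200 (0xC8, binary 11001000) is a leading byte of a two-byte sequence. It must be followed by a continuation byte (binary 10xxxxxx, i.e. 128 to 191), but your array ends right after it.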
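
To make the points above concrete, here is a minimal sketch. The class name and the extra array `valid` are only for illustration and are not from your code. It shows the default decoder substituting U+FFFD, the same leading byte round-tripping once a continuation byte follows it, and a strict `UTF8Encoding` instance (constructed with `throwOnInvalidBytes: true`) that throws instead of substituting silently.

```csharp
using System;
using System.Text;

class Utf8RoundTripDemo
{
    static void Main()
    {
        byte[] c = { 1, 2, 200 };

        // Encoding.UTF8 replaces the incomplete sequence with U+FFFD when decoding,
        // so encoding the string again yields 239, 191, 189 in place of 200.
        string s = Encoding.UTF8.GetString(c);
        Console.WriteLine(s[2] == '\uFFFD');                              // True
        Console.WriteLine(string.Join(", ", Encoding.UTF8.GetBytes(s)));  // 1, 2, 239, 191, 189

        // Illustrative input: 200 followed by a continuation byte (128) is valid UTF-8
        // (it encodes U+0200), so this array survives the round trip unchanged.
        byte[] valid = { 1, 2, 200, 128 };
        byte[] back = Encoding.UTF8.GetBytes(Encoding.UTF8.GetString(valid));
        Console.WriteLine(string.Join(", ", back));                       // 1, 2, 200, 128

        // A strict instance reports invalid input instead of hiding it.
        var strict = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false,
                                      throwOnInvalidBytes: true);
        try
        {
            strict.GetString(c);
        }
        catch (DecoderFallbackException)
        {
            Console.WriteLine("Invalid UTF-8 detected");
        }
    }
}
```

If the goal is to detect bad input early rather than carry replacement characters around, the strict instance is probably the more convenient of the two, since the failure surfaces at the decode call.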