Whats going on with this byte array?

2020-06-03 01:56发布

问题:

I have a byte array: 00 01 00 00 00 12 81 00 00 01 00 C8 00 00 00 00 00 08 5C 9F 4F A5 09 45 D4 CE

It is read via StreamReader using UTF8 encoding

// Note I can't change this code, to many component dependent on it.
using (StreamReader streamReader = 
    new StreamReader(responseStream, Encoding.UTF8, false))
{
    string streamData = streamReader.ReadToEnd();
    if (requestData.Callback != null)
    {
        requestData.Callback(response, streamData);
    }
}

When that function runs I get the following returned to me (i converted to a byte array)

00 01 00 00 00 12 EF BF BD 00 00 01 00 EF BF BD 00 00 00 00 00 08 5C EF BF BD 4F EF BF BD 09 45 EF BF BD

Somehow I need to take whats returned to me and get it back to the right encoding and the right byte array, but I've tried alot.

Please be aware, I'm working with WP7 limited API.

Hopefully you guys can help.

Thanks!

Update for help...

if I do the following code, it's almost right, only thing that is wrong is the 5th to last byte gets split out.

byte[] writeBuf1 = System.Text.Encoding.UTF8.GetBytes(data);
                    string buf1string = System.Text.Encoding.BigEndianUnicode.GetString(writeBuf1, 0, writeBuf1.Length);
                    byte[] writeBuf = System.Text.Encoding.BigEndianUnicode.GetBytes(buf1string);

回答1:

The original byte array is not encoded as UTF-8. The StreamReader therefore replaces each invalid byte with the replacement character U+FFFD. When that character gets encoded back to UTF-8, this results in the byte sequence EF BF BD. You cannot construct the original byte value from the string because the information is completely lost.