.NET System.Text.Decoder.Convert method returning

I need to read a string from a sequence of UTF-8 bytes. The bytes arrive in separate read operations that do not respect character boundaries, so I cannot use System.Text.Encoding.UTF8.GetString. However, the System.Text.Decoder class, as returned by System.Text.Encoding.UTF8.GetDecoder(), appears to be designed for this scenario. One of the out arguments of its Convert method, completed, looks like it should indicate when a character has only been partially read.

The documentation for Convert (at https://msdn.microsoft.com/en-us/library/h6w985hz(v=vs.110).aspx) suggests that the completed value should be false if either the output (char[]) buffer was too small or not all the bytes could be converted. See Remarks, paragraph 4.

However, the completed value appears to be true even when the docs say it should be false, that is, when the bytes of a character have not been completely converted.

I presume I'm doing something wrong (or is this a bug?). If so, how can I detect that my byte stream is paused in the middle of a character?

Demonstration code:

const int outSize = 10;
char[] outBuf = new char[outSize];
byte[] frag1 = new byte[] { 0xE7 };
byte[] frag2 = new byte[] { 0x95, 0xA2 };
var decoder = System.Text.Encoding.UTF8.GetDecoder();
int bytesUsed, charsUsed; bool completed;

// the first byte of the UTF-8 character
decoder.Convert(frag1, 0, frag1.Length, outBuf, 0, outSize, false, out bytesUsed, out charsUsed, out completed);
Debug.Assert( bytesUsed == 1 );
Debug.Assert( charsUsed == 0 );

// completed is true here, but WHY? The assertion below fails.
Debug.Assert(!completed);

// the second and third bytes of the UTF-8 character
decoder.Convert(frag2, 0, frag2.Length, outBuf, 0, outSize, false, out bytesUsed, out charsUsed, out completed);
Debug.Assert(bytesUsed == 2);
Debug.Assert(charsUsed == 1);
Debug.Assert(completed);
Debug.Assert( new String(outBuf, 0, 1 ) == "畢" );
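
One possible way to detect buffered bytes, offered as a sketch rather than a confirmed answer: Convert appears to report completed as true because it consumed every input byte into the decoder's internal state, even though no character was produced yet. Decoder.GetCharCount is documented as not affecting the decoder's state, so calling it with zero input bytes and flush set to true should reveal whether a flush would emit anything. The helper name HasPendingBytes is hypothetical, and the check assumes the default replacement fallback of Encoding.UTF8, under which an incomplete sequence would be counted as one replacement character.

```csharp
using System;
using System.Text;

static class Utf8PendingCheck
{
    // Hypothetical helper (not part of the framework).
    // Assumption: with the default replacement fallback, flushing a decoder
    // that holds an incomplete sequence would yield a replacement character,
    // so a non-zero count means bytes are still buffered.
    // GetCharCount only counts; it does not change the decoder's state.
    static bool HasPendingBytes(Decoder decoder)
    {
        return decoder.GetCharCount(new byte[0], 0, 0, true) > 0;
    }

    static void Main()
    {
        char[] outBuf = new char[10];
        byte[] frag1 = new byte[] { 0xE7 };
        byte[] frag2 = new byte[] { 0x95, 0xA2 };
        Decoder decoder = Encoding.UTF8.GetDecoder();
        int bytesUsed, charsUsed;
        bool completed;

        // First byte only: the character is incomplete at this point.
        decoder.Convert(frag1, 0, frag1.Length, outBuf, 0, outBuf.Length,
                        false, out bytesUsed, out charsUsed, out completed);
        // Should report pending bytes if the assumption above holds.
        Console.WriteLine("pending after frag1: " + HasPendingBytes(decoder));

        // Remaining two bytes: the character can now be emitted.
        decoder.Convert(frag2, 0, frag2.Length, outBuf, 0, outBuf.Length,
                        false, out bytesUsed, out charsUsed, out completed);
        // Should report no pending bytes once the character is complete.
        Console.WriteLine("pending after frag2: " + HasPendingBytes(decoder));
    }
}
```

If the decoder is created from an encoding with an exception fallback instead, the same call would presumably throw rather than return a count, so the check would need a try/catch in that case.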

Thanks!

Tags: c# .net unicode utf-8
