No UTF-32 big-endian in C#?

Posted 2019-04-28 17:29

Question:

In C#, Encoding.UTF32 is UTF-32 little-endian, Encoding.BigEndianUnicode is UTF-16 big-endian, and Encoding.Unicode is UTF-16 little-endian. But I can't find a predefined property for UTF-32 big-endian.

I'm developing a simple text viewer. I don't think many documents are encoded in UTF-32 big-endian, but I want to be prepared for that too, just in case.

Doesn't C# support UTF-32 big-endian?

BTW Java supports it.

Answer 1:

C# does support big-endian UTF-32. Just create the encoding yourself using the overloaded constructor:

Encoding e = new UTF32Encoding(true /*bigEndian*/, true /*byteOrderMark*/);

The encodings predefined as static properties on Encoding aren't an exhaustive list. You can create many more encodings yourself, either through constructors such as the one above or through Encoding.GetEncoding.
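To check that the constructor above really yields big-endian UTF-32, here is a minimal sketch that prints the byte order mark and the bytes for a single character. The class name Utf32BeDemo and the Hex helper are made up for illustration; only UTF32Encoding, GetPreamble, GetBytes, and GetString come from the framework.

```csharp
using System;
using System.Text;

static class Utf32BeDemo
{
    // Hypothetical helper: formats a byte array as hex pairs, e.g. "00-00-FE-FF".
    public static string Hex(byte[] bytes) => BitConverter.ToString(bytes);

    static void Main()
    {
        // bigEndian: true, byteOrderMark: true (same arguments as in the answer).
        Encoding utf32Be = new UTF32Encoding(true, true);

        // The preamble is the BOM that writers such as StreamWriter emit at the start of a file.
        Console.WriteLine(Hex(utf32Be.GetPreamble())); // 00-00-FE-FF

        // GetBytes does not include the BOM; the most significant byte comes first.
        byte[] encoded = utf32Be.GetBytes("A");
        Console.WriteLine(Hex(encoded));               // 00-00-00-41

        // Round trip back to a string.
        Console.WriteLine(utf32Be.GetString(encoded)); // A
    }
}
```

With Encoding.UTF32 (little-endian) the same character would encode as 41-00-00-00 instead.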



Answer 2:

//https://docs.microsoft.com/en-us/dotnet/api/system.text.encoding?view=netframework-4.7.2
//12000 utf-32  Unicode (UTF-32)    ✓   ✓
//12001 utf-32BE    Unicode (UTF-32 Big endian)
const string strUniRepChr = "\uFFFD"; // Unicode character 'REPLACEMENT CHARACTER' (U+FFFD)
// Code page 12001 is UTF-32 big-endian; 12000 would give little-endian UTF-32.
Encoding cpUTF32BE = Encoding.GetEncoding(12001,
                     new EncoderReplacementFallback(strUniRepChr),
                     new DecoderReplacementFallback(strUniRepChr));
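As a rough illustration of the code-page route, the sketch below looks up code page 12001 (the utf-32BE row in the comments above) and decodes a few hand-written big-endian bytes. The class name CodePage12001Demo and the sample bytes are assumptions made for the example, and the fallback arguments are left out to keep it short.

```csharp
using System;
using System.Text;

static class CodePage12001Demo
{
    static void Main()
    {
        // Look up UTF-32 big-endian by its code page number.
        Encoding utf32Be = Encoding.GetEncoding(12001);
        Console.WriteLine(utf32Be.WebName); // utf-32BE

        // "Hi" as UTF-32 big-endian: four bytes per character, high byte first.
        byte[] data = { 0x00, 0x00, 0x00, 0x48, 0x00, 0x00, 0x00, 0x69 };
        Console.WriteLine(utf32Be.GetString(data)); // Hi
    }
}
```

For a text viewer, the same Encoding object can be passed to File.ReadAllText or a StreamReader constructor when opening the file.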