How do I get a consistent byte representation of s-第4页回答

How do I convert a string to a byte[] in .NET (C#) without manually specifying a specific encoding?

I'm going to encrypt the string. I can encrypt it without converting, but I'd still like to know why encoding comes to play here.

Also, why should encoding be taken into consideration? Can't I simply get what bytes the string has been stored in? Why is there a dependency on character encodings?

标签： c# .net string character-encoding

30条回答

高级女魔头

2楼-- · 2018-12-31 02:09

You need to take the encoding into account, because 1 character could be represented by 1 or more bytes (up to about 6), and different encodings will treat these bytes differently.

Joel has a posting on this:

The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)

0人赞添加讨论(0) 举报

余欢

3楼-- · 2018-12-31 02:09

Simply use this:

byte[] myByte= System.Text.ASCIIEncoding.Default.GetBytes(myString);

0人赞添加讨论(0) 举报

泛滥B

4楼-- · 2018-12-31 02:10

Try this, a lot less code:

System.Text.Encoding.UTF8.GetBytes("TEST String");

0人赞添加讨论(0) 举报

看淡一切

5楼-- · 2018-12-31 02:10

If you really want a copy of the underlying bytes of a string, you can use a function like the one that follows. However, you shouldn't please read on to find out why.

[DllImport(
        "msvcrt.dll",
        EntryPoint = "memcpy",
        CallingConvention = CallingConvention.Cdecl,
        SetLastError = false)]
private static extern unsafe void* UnsafeMemoryCopy(
    void* destination,
    void* source,
    uint count);

public static byte[] GetUnderlyingBytes(string source)
{
    var length = source.Length * sizeof(char);
    var result = new byte[length];
    unsafe
    {
        fixed (char* firstSourceChar = source)
        fixed (byte* firstDestination = result)
        {
            var firstSource = (byte*)firstSourceChar;
            UnsafeMemoryCopy(
                firstDestination,
                firstSource,
                (uint)length);
        }
    }

    return result;
}

This function will get you a copy of the bytes underlying your string, pretty quickly. You'll get those bytes in whatever way they are encoding on your system. This encoding is almost certainly UTF-16LE but that is an implementation detail you shouldn't have to care about.

It would be safer, simpler and more reliable to just call,

System.Text.Encoding.Unicode.GetBytes()

In all likelihood this will give the same result, is easier to type, and the bytes will always round-trip with a call to

System.Text.Encoding.Unicode.GetString()

0人赞添加讨论(0) 举报

梦寄多情

6楼-- · 2018-12-31 02:12

Two ways:

public static byte[] StrToByteArray(this string s)
{
    List<byte> value = new List<byte>();
    foreach (char c in s.ToCharArray())
        value.Add(c.ToByte());
    return value.ToArray();
}

And,

public static byte[] StrToByteArray(this string s)
{
    s = s.Replace(" ", string.Empty);
    byte[] buffer = new byte[s.Length / 2];
    for (int i = 0; i < s.Length; i += 2)
        buffer[i / 2] = (byte)Convert.ToByte(s.Substring(i, 2), 16);
    return buffer;
}

I tend to use the bottom one more often than the top, haven't benchmarked them for speed.

0人赞添加讨论(0) 举报

谁念西风独自凉

7楼-- · 2018-12-31 02:13

It depends on the encoding of your string (ASCII, UTF-8, ...).

For example:

byte[] b1 = System.Text.Encoding.UTF8.GetBytes (myString);
byte[] b2 = System.Text.Encoding.ASCII.GetBytes (myString);

A small sample why encoding matters:

string pi = "\u03a0";
byte[] ascii = System.Text.Encoding.ASCII.GetBytes (pi);
byte[] utf8 = System.Text.Encoding.UTF8.GetBytes (pi);

Console.WriteLine (ascii.Length); //Will print 1
Console.WriteLine (utf8.Length); //Will print 2
Console.WriteLine (System.Text.Encoding.ASCII.GetString (ascii)); //Will print '?'

ASCII simply isn't equipped to deal with special characters.

Internally, the .NET framework uses UTF-16 to represent strings, so if you simply want to get the exact bytes that .NET uses, use System.Text.Encoding.Unicode.GetBytes (...).

See Character Encoding in the .NET Framework (MSDN) for more information.

0人赞添加讨论(0) 举报

How do I get a consistent byte representation of s

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间