How do you convert a byte array to a hexadecimal s

2018-12-30 23:20发布

How can you convert a byte array to a hexadecimal string, and vice versa?

标签: c# arrays hex
30条回答
与风俱净
2楼-- · 2018-12-30 23:57

And for inserting into an SQL string (if you're not using command parameters):

public static String ByteArrayToSQLHexString(byte[] Source)
{
    return = "0x" + BitConverter.ToString(Source).Replace("-", "");
}
查看更多
几人难应
3楼-- · 2018-12-30 23:58

Either:

public static string ByteArrayToString(byte[] ba)
{
  StringBuilder hex = new StringBuilder(ba.Length * 2);
  foreach (byte b in ba)
    hex.AppendFormat("{0:x2}", b);
  return hex.ToString();
}

or:

public static string ByteArrayToString(byte[] ba)
{
  return BitConverter.ToString(ba).Replace("-","");
}

There are even more variants of doing it, for example here.

The reverse conversion would go like this:

public static byte[] StringToByteArray(String hex)
{
  int NumberChars = hex.Length;
  byte[] bytes = new byte[NumberChars / 2];
  for (int i = 0; i < NumberChars; i += 2)
    bytes[i / 2] = Convert.ToByte(hex.Substring(i, 2), 16);
  return bytes;
}

Using Substring is the best option in combination with Convert.ToByte. See this answer for more information. If you need better performance, you must avoid Convert.ToByte before you can drop SubString.

查看更多
ら面具成の殇う
4楼-- · 2018-12-30 23:58

This is an answer to revision 4 of Tomalak's highly popular answer (and subsequent edits).

I'll make the case that this edit is wrong, and explain why it could be reverted. Along the way, you might learn a thing or two about some internals, and see yet another example of what premature optimization really is and how it can bite you.

tl;dr: Just use Convert.ToByte and String.Substring if you're in a hurry ("Original code" below), it's the best combination if you don't want to re-implement Convert.ToByte. Use something more advanced (see other answers) that doesn't use Convert.ToByte if you need performance. Do not use anything else other than String.Substring in combination with Convert.ToByte, unless someone has something interesting to say about this in the comments of this answer.

warning: This answer may become obsolete if a Convert.ToByte(char[], Int32) overload is implemented in the framework. This is unlikely to happen soon.

As a general rule, I don't much like to say "don't optimize prematurely", because nobody knows when "premature" is. The only thing you must consider when deciding whether to optimize or not is: "Do I have the time and resources to investigate optimization approaches properly?". If you don't, then it's too soon, wait until your project is more mature or until you need the performance (if there is a real need, then you will make the time). In the meantime, do the simplest thing that could possibly work instead.

Original code:

    public static byte[] HexadecimalStringToByteArray_Original(string input)
    {
        var outputLength = input.Length / 2;
        var output = new byte[outputLength];
        for (var i = 0; i < outputLength; i++)
            output[i] = Convert.ToByte(input.Substring(i * 2, 2), 16);
        return output;
    }

Revision 4:

    public static byte[] HexadecimalStringToByteArray_Rev4(string input)
    {
        var outputLength = input.Length / 2;
        var output = new byte[outputLength];
        using (var sr = new StringReader(input))
        {
            for (var i = 0; i < outputLength; i++)
                output[i] = Convert.ToByte(new string(new char[2] { (char)sr.Read(), (char)sr.Read() }), 16);
        }
        return output;
    }

The revision avoids String.Substring and uses a StringReader instead. The given reason is:

Edit: you can improve performance for long strings by using a single pass parser, like so:

Well, looking at the reference code for String.Substring, it's clearly "single-pass" already; and why shouldn't it be? It operates at byte-level, not on surrogate pairs.

It does allocate a new string however, but then you need to allocate one to pass to Convert.ToByte anyway. Furthermore, the solution provided in the revision allocates yet another object on every iteration (the two-char array); you can safely put that allocation outside the loop and reuse the array to avoid that.

    public static byte[] HexadecimalStringToByteArray(string input)
    {
        var outputLength = input.Length / 2;
        var output = new byte[outputLength];
        var numeral = new char[2];
        using (var sr = new StringReader(input))
        {
            for (var i = 0; i < outputLength; i++)
            {
                numeral[0] = (char)sr.Read();
                numeral[1] = (char)sr.Read();
                output[i] = Convert.ToByte(new string(numeral), 16);
            }
        }
        return output;
    }

Each hexadecimal numeral represents a single octet using two digits (symbols).

But then, why call StringReader.Read twice? Just call its second overload and ask it to read two characters in the two-char array at once; and reduce the amount of calls by two.

    public static byte[] HexadecimalStringToByteArray(string input)
    {
        var outputLength = input.Length / 2;
        var output = new byte[outputLength];
        var numeral = new char[2];
        using (var sr = new StringReader(input))
        {
            for (var i = 0; i < outputLength; i++)
            {
                var read = sr.Read(numeral, 0, 2);
                Debug.Assert(read == 2);
                output[i] = Convert.ToByte(new string(numeral), 16);
            }
        }
        return output;
    }

What you're left with is a string reader whose only added "value" is a parallel index (internal _pos) which you could have declared yourself (as j for example), a redundant length variable (internal _length), and a redundant reference to the input string (internal _s). In other words, it's useless.

If you wonder how Read "reads", just look at the code, all it does is call String.CopyTo on the input string. The rest is just book-keeping overhead to maintain values we don't need.

So, remove the string reader already, and call CopyTo yourself; it's simpler, clearer, and more efficient.

    public static byte[] HexadecimalStringToByteArray(string input)
    {
        var outputLength = input.Length / 2;
        var output = new byte[outputLength];
        var numeral = new char[2];
        for (int i = 0, j = 0; i < outputLength; i++, j += 2)
        {
            input.CopyTo(j, numeral, 0, 2);
            output[i] = Convert.ToByte(new string(numeral), 16);
        }
        return output;
    }

Do you really need a j index that increments in steps of two parallel to i? Of course not, just multiply i by two (which the compiler should be able to optimize to an addition).

    public static byte[] HexadecimalStringToByteArray_BestEffort(string input)
    {
        var outputLength = input.Length / 2;
        var output = new byte[outputLength];
        var numeral = new char[2];
        for (int i = 0; i < outputLength; i++)
        {
            input.CopyTo(i * 2, numeral, 0, 2);
            output[i] = Convert.ToByte(new string(numeral), 16);
        }
        return output;
    }

What does the solution look like now? Exactly like it was at the beginning, only instead of using String.Substring to allocate the string and copy the data to it, you're using an intermediary array to which you copy the hexadecimal numerals to, then allocate the string yourself and copy the data again from the array and into the string (when you pass it in the string constructor). The second copy might be optimized-out if the string is already in the intern pool, but then String.Substring will also be able to avoid it in these cases.

In fact, if you look at String.Substring again, you see that it uses some low-level internal knowledge of how strings are constructed to allocate the string faster than you could normally do it, and it inlines the same code used by CopyTo directly in there to avoid the call overhead.

String.Substring

  • Worst-case: One fast allocation, one fast copy.
  • Best-case: No allocation, no copy.

Manual method

  • Worst-case: Two normal allocations, one normal copy, one fast copy.
  • Best-case: One normal allocation, one normal copy.

Conclusion? If you want to use Convert.ToByte(String, Int32) (because you don't want to re-implement that functionality yourself), there doesn't seem to be a way to beat String.Substring; all you do is run in circles, re-inventing the wheel (only with sub-optimal materials).

Note that using Convert.ToByte and String.Substring is a perfectly valid choice if you don't need extreme performance. Remember: only opt for an alternative if you have the time and resources to investigate how it works properly.

If there was a Convert.ToByte(char[], Int32), things would be different of course (it would be possible to do what I described above and completely avoid String).

I suspect that people who report better performance by "avoiding String.Substring" also avoid Convert.ToByte(String, Int32), which you should really be doing if you need the performance anyway. Look at the countless other answers to discover all the different approaches to do that.

Disclaimer: I haven't decompiled the latest version of the framework to verify that the reference source is up-to-date, I assume it is.

Now, it all sounds good and logical, hopefully even obvious if you've managed to get so far. But is it true?

Intel(R) Core(TM) i7-3720QM CPU @ 2.60GHz
    Cores: 8
    Current Clock Speed: 2600
    Max Clock Speed: 2600
--------------------
Parsing hexadecimal string into an array of bytes
--------------------
HexadecimalStringToByteArray_Original: 7,777.09 average ticks (over 10000 runs), 1.2X
HexadecimalStringToByteArray_BestEffort: 8,550.82 average ticks (over 10000 runs), 1.1X
HexadecimalStringToByteArray_Rev4: 9,218.03 average ticks (over 10000 runs), 1.0X

Yes!

Props to Partridge for the bench framework, it's easy to hack. The input used is the following SHA-1 hash repeated 5000 times to make a 100,000 bytes long string.

209113288F93A9AB8E474EA78D899AFDBB874355

Have fun! (But optimize with moderation.)

查看更多
初与友歌
5楼-- · 2018-12-30 23:59

Performance Analysis

Note: new leader as of 2015-08-20.

I ran each of the various conversion methods through some crude Stopwatch performance testing, a run with a random sentence (n=61, 1000 iterations) and a run with a Project Gutenburg text (n=1,238,957, 150 iterations). Here are the results, roughly from fastest to slowest. All measurements are in ticks (10,000 ticks = 1 ms) and all relative notes are compared to the [slowest] StringBuilder implementation. For the code used, see below or the test framework repo where I now maintain the code for running this.

Disclaimer

WARNING: Do not rely on these stats for anything concrete; they are simply a sample run of sample data. If you really need top-notch performance, please test these methods in an environment representative of your production needs with data representative of what you will use.

Results

Lookup tables have taken the lead over byte manipulation. Basically, there is some form of precomputing what any given nibble or byte will be in hex. Then, as you rip through the data, you simply look up the next portion to see what hex string it would be. That value is then added to the resulting string output in some fashion. For a long time byte manipulation, potentially harder to read by some developers, was the top-performing approach.

Your best bet is still going to be finding some representative data and trying it out in a production-like environment. If you have different memory constraints, you may prefer a method with fewer allocations to one that would be faster but consume more memory.

Testing Code

Feel free to play with the testing code I used. A version is included here but feel free to clone the repo and add your own methods. Please submit a pull request if you find anything interesting or want to help improve the testing framework it uses.

  1. Add the new static method (Func<byte[], string>) to /Tests/ConvertByteArrayToHexString/Test.cs.
  2. Add that method's name to the TestCandidates return value in that same class.
  3. Make sure you are running the input version you want, sentence or text, by toggling the comments in GenerateTestInput in that same class.
  4. Hit F5 and wait for the output (an HTML dump is also generated in the /bin folder).
static string ByteArrayToHexStringViaStringJoinArrayConvertAll(byte[] bytes) {
    return string.Join(string.Empty, Array.ConvertAll(bytes, b => b.ToString("X2")));
}
static string ByteArrayToHexStringViaStringConcatArrayConvertAll(byte[] bytes) {
    return string.Concat(Array.ConvertAll(bytes, b => b.ToString("X2")));
}
static string ByteArrayToHexStringViaBitConverter(byte[] bytes) {
    string hex = BitConverter.ToString(bytes);
    return hex.Replace("-", "");
}
static string ByteArrayToHexStringViaStringBuilderAggregateByteToString(byte[] bytes) {
    return bytes.Aggregate(new StringBuilder(bytes.Length * 2), (sb, b) => sb.Append(b.ToString("X2"))).ToString();
}
static string ByteArrayToHexStringViaStringBuilderForEachByteToString(byte[] bytes) {
    StringBuilder hex = new StringBuilder(bytes.Length * 2);
    foreach (byte b in bytes)
        hex.Append(b.ToString("X2"));
    return hex.ToString();
}
static string ByteArrayToHexStringViaStringBuilderAggregateAppendFormat(byte[] bytes) {
    return bytes.Aggregate(new StringBuilder(bytes.Length * 2), (sb, b) => sb.AppendFormat("{0:X2}", b)).ToString();
}
static string ByteArrayToHexStringViaStringBuilderForEachAppendFormat(byte[] bytes) {
    StringBuilder hex = new StringBuilder(bytes.Length * 2);
    foreach (byte b in bytes)
        hex.AppendFormat("{0:X2}", b);
    return hex.ToString();
}
static string ByteArrayToHexViaByteManipulation(byte[] bytes) {
    char[] c = new char[bytes.Length * 2];
    byte b;
    for (int i = 0; i < bytes.Length; i++) {
        b = ((byte)(bytes[i] >> 4));
        c[i * 2] = (char)(b > 9 ? b + 0x37 : b + 0x30);
        b = ((byte)(bytes[i] & 0xF));
        c[i * 2 + 1] = (char)(b > 9 ? b + 0x37 : b + 0x30);
    }
    return new string(c);
}
static string ByteArrayToHexViaByteManipulation2(byte[] bytes) {
    char[] c = new char[bytes.Length * 2];
    int b;
    for (int i = 0; i < bytes.Length; i++) {
        b = bytes[i] >> 4;
        c[i * 2] = (char)(55 + b + (((b - 10) >> 31) & -7));
        b = bytes[i] & 0xF;
        c[i * 2 + 1] = (char)(55 + b + (((b - 10) >> 31) & -7));
    }
    return new string(c);
}
static string ByteArrayToHexViaSoapHexBinary(byte[] bytes) {
    SoapHexBinary soapHexBinary = new SoapHexBinary(bytes);
    return soapHexBinary.ToString();
}
static string ByteArrayToHexViaLookupAndShift(byte[] bytes) {
    StringBuilder result = new StringBuilder(bytes.Length * 2);
    string hexAlphabet = "0123456789ABCDEF";
    foreach (byte b in bytes) {
        result.Append(hexAlphabet[(int)(b >> 4)]);
        result.Append(hexAlphabet[(int)(b & 0xF)]);
    }
    return result.ToString();
}
static readonly uint* _lookup32UnsafeP = (uint*)GCHandle.Alloc(_Lookup32, GCHandleType.Pinned).AddrOfPinnedObject();
static string ByteArrayToHexViaLookup32UnsafeDirect(byte[] bytes) {
    var lookupP = _lookup32UnsafeP;
    var result = new string((char)0, bytes.Length * 2);
    fixed (byte* bytesP = bytes)
    fixed (char* resultP = result) {
        uint* resultP2 = (uint*)resultP;
        for (int i = 0; i < bytes.Length; i++) {
            resultP2[i] = lookupP[bytesP[i]];
        }
    }
    return result;
}
static uint[] _Lookup32 = Enumerable.Range(0, 255).Select(i => {
    string s = i.ToString("X2");
    return ((uint)s[0]) + ((uint)s[1] << 16);
}).ToArray();
static string ByteArrayToHexViaLookupPerByte(byte[] bytes) {
    var result = new char[bytes.Length * 2];
    for (int i = 0; i < bytes.Length; i++)
    {
        var val = _Lookup32[bytes[i]];
        result[2*i] = (char)val;
        result[2*i + 1] = (char) (val >> 16);
    }
    return new string(result);
}
static string ByteArrayToHexViaLookup(byte[] bytes) {
    string[] hexStringTable = new string[] {
        "00", "01", "02", "03", "04", "05", "06", "07", "08", "09", "0A", "0B", "0C", "0D", "0E", "0F",
        "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "1A", "1B", "1C", "1D", "1E", "1F",
        "20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "2A", "2B", "2C", "2D", "2E", "2F",
        "30", "31", "32", "33", "34", "35", "36", "37", "38", "39", "3A", "3B", "3C", "3D", "3E", "3F",
        "40", "41", "42", "43", "44", "45", "46", "47", "48", "49", "4A", "4B", "4C", "4D", "4E", "4F",
        "50", "51", "52", "53", "54", "55", "56", "57", "58", "59", "5A", "5B", "5C", "5D", "5E", "5F",
        "60", "61", "62", "63", "64", "65", "66", "67", "68", "69", "6A", "6B", "6C", "6D", "6E", "6F",
        "70", "71", "72", "73", "74", "75", "76", "77", "78", "79", "7A", "7B", "7C", "7D", "7E", "7F",
        "80", "81", "82", "83", "84", "85", "86", "87", "88", "89", "8A", "8B", "8C", "8D", "8E", "8F",
        "90", "91", "92", "93", "94", "95", "96", "97", "98", "99", "9A", "9B", "9C", "9D", "9E", "9F",
        "A0", "A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9", "AA", "AB", "AC", "AD", "AE", "AF",
        "B0", "B1", "B2", "B3", "B4", "B5", "B6", "B7", "B8", "B9", "BA", "BB", "BC", "BD", "BE", "BF",
        "C0", "C1", "C2", "C3", "C4", "C5", "C6", "C7", "C8", "C9", "CA", "CB", "CC", "CD", "CE", "CF",
        "D0", "D1", "D2", "D3", "D4", "D5", "D6", "D7", "D8", "D9", "DA", "DB", "DC", "DD", "DE", "DF",
        "E0", "E1", "E2", "E3", "E4", "E5", "E6", "E7", "E8", "E9", "EA", "EB", "EC", "ED", "EE", "EF",
        "F0", "F1", "F2", "F3", "F4", "F5", "F6", "F7", "F8", "F9", "FA", "FB", "FC", "FD", "FE", "FF",
    };
    StringBuilder result = new StringBuilder(bytes.Length * 2);
    foreach (byte b in bytes) {
        result.Append(hexStringTable[b]);
    }
    return result.ToString();
}

Update (2010-01-13)

Added Waleed's answer to analysis. Quite fast.

Update (2011-10-05)

Added string.Concat Array.ConvertAll variant for completeness (requires .NET 4.0). On par with string.Join version.

Update (2012-02-05)

Test repo includes more variants such as StringBuilder.Append(b.ToString("X2")). None upset the results any. foreach is faster than {IEnumerable}.Aggregate, for instance, but BitConverter still wins.

Update (2012-04-03)

Added Mykroft's SoapHexBinary answer to analysis, which took over third place.

Update (2013-01-15)

Added CodesInChaos's byte manipulation answer, which took over first place (by a large margin on large blocks of text).

Update (2013-05-23)

Added Nathan Moinvaziri's lookup answer and the variant from Brian Lambert's blog. Both rather fast, but not taking the lead on the test machine I used (AMD Phenom 9750).

Update (2014-07-31)

Added @CodesInChaos's new byte-based lookup answer. It appears to have taken the lead on both the sentence tests and the full-text tests.

Update (2015-08-20)

Added airbreather's optimizations and unsafe variant to this answer's repo. If you want to play in the unsafe game, you can get some huge performance gains over any of the prior top winners on both short strings and large texts.

查看更多
长期被迫恋爱
6楼-- · 2018-12-31 00:00

Complement to answer by @CodesInChaos (reversed method)

public static byte[] HexToByteUsingByteManipulation(string s)
{
    byte[] bytes = new byte[s.Length / 2];
    for (int i = 0; i < bytes.Length; i++)
    {
        int hi = s[i*2] - 65;
        hi = hi + 10 + ((hi >> 31) & 7);

        int lo = s[i*2 + 1] - 65;
        lo = lo + 10 + ((lo >> 31) & 7) & 0x0f;

        bytes[i] = (byte) (lo | hi << 4);
    }
    return bytes;
}

Explanation:

& 0x0f is to support also lower case letters

hi = hi + 10 + ((hi >> 31) & 7); is the same as:

hi = ch-65 + 10 + (((ch-65) >> 31) & 7);

For '0'..'9' it is the same as hi = ch - 65 + 10 + 7; which is hi = ch - 48 (this is because of 0xffffffff & 7).

For 'A'..'F' it is hi = ch - 65 + 10; (this is because of 0x00000000 & 7).

For 'a'..'f' we have to big numbers so we must subtract 32 from default version by making some bits 0 by using & 0x0f.

65 is code for 'A'

48 is code for '0'

7 is the number of letters between '9' and 'A' in the ASCII table (...456789:;<=>?@ABCD...).

查看更多
梦醉为红颜
7楼-- · 2018-12-31 00:00

This version of ByteArrayToHexViaByteManipulation could be faster.

From my reports:

  • ByteArrayToHexViaByteManipulation3: 1,68 average ticks (over 1000 runs), 17,5X
  • ByteArrayToHexViaByteManipulation2: 1,73 average ticks (over 1000 runs), 16,9X
  • ByteArrayToHexViaByteManipulation: 2,90 average ticks (over 1000 runs), 10,1X
  • ByteArrayToHexViaLookupAndShift: 3,22 average ticks (over 1000 runs), 9,1X
  • ...

    static private readonly char[] hexAlphabet = new char[]
        {'0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F'};
    static string ByteArrayToHexViaByteManipulation3(byte[] bytes)
    {
        char[] c = new char[bytes.Length * 2];
        byte b;
        for (int i = 0; i < bytes.Length; i++)
        {
            b = ((byte)(bytes[i] >> 4));
            c[i * 2] = hexAlphabet[b];
            b = ((byte)(bytes[i] & 0xF));
            c[i * 2 + 1] = hexAlphabet[b];
        }
        return new string(c);
    }
    

And I think this one is an optimization:

    static private readonly char[] hexAlphabet = new char[]
        {'0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F'};
    static string ByteArrayToHexViaByteManipulation4(byte[] bytes)
    {
        char[] c = new char[bytes.Length * 2];
        for (int i = 0, ptr = 0; i < bytes.Length; i++, ptr += 2)
        {
            byte b = bytes[i];
            c[ptr] = hexAlphabet[b >> 4];
            c[ptr + 1] = hexAlphabet[b & 0xF];
        }
        return new string(c);
    }
查看更多
登录 后发表回答