To my surprise, the following method produces a different result in debug vs. release:
int result = "test".GetHashCode();
Is there any way to avoid this?
I need a reliable way to hash a string and I need the value to be consistent in debug and release mode. I would like to avoid writing my own hashing function if possible.
Why does this happen?
FYI, Reflector gives me:
[ReliabilityContract(Consistency.WillNotCorruptState, Cer.MayFail), SecuritySafeCritical]
public override unsafe int GetHashCode()
{
    fixed (char* str = ((char*) this))
    {
        char* chPtr = str;
        int num = 0x15051505;
        int num2 = num;
        int* numPtr = (int*) chPtr;
        for (int i = this.Length; i > 0; i -= 4)
        {
            num = (((num << 5) + num) + (num >> 0x1b)) ^ numPtr[0];
            if (i <= 2)
            {
                break;
            }
            num2 = (((num2 << 5) + num2) + (num2 >> 0x1b)) ^ numPtr[1];
            numPtr += 2;
        }
        return (num + (num2 * 0x5d588b65));
    }
}
GetHashCode() is not what you should be using to hash a string, almost 100% of the time. Without knowing what you're doing, I recommend that you use an actual hash algorithm, like SHA-1:
using (System.Security.Cryptography.SHA1Managed hp = new System.Security.Cryptography.SHA1Managed())
{
    // Pick one encoding (ASCII, Unicode, UTF8, UTF32, ...) and use it everywhere: the hash depends on the bytes.
    byte[] hash = hp.ComputeHash(System.Text.Encoding.UTF8.GetBytes(theString));
}
Update: If you need something faster, there's also SHA1Cng, which is significantly faster than SHA1Managed.
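Since the question asks for an int, here is a minimal sketch of how the SHA-1 digest could be folded into one. The class name StableStringHash, the choice of UTF-8, and taking the first four digest bytes are my own assumptions for illustration, not part of the answer above. It uses SHA1.Create() rather than naming a concrete implementation.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

static class StableStringHash
{
    // Hypothetical helper: derives a 32-bit value from the SHA-1 digest of the string.
    public static int Compute(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));

        using (SHA1 sha1 = SHA1.Create())
        {
            // Assumption: UTF-8. Any encoding works as long as it never changes.
            byte[] digest = sha1.ComputeHash(Encoding.UTF8.GetBytes(text));

            // Assemble the first four bytes explicitly (little-endian order)
            // so the result does not depend on the machine's byte order.
            return digest[0] | (digest[1] << 8) | (digest[2] << 16) | (digest[3] << 24);
        }
    }
}
```

Usage would be int result = StableStringHash.Compute("test");. The value depends only on the bytes of the string, so it should be the same in debug and release builds. Truncating to 32 bits gives up the collision resistance of the full digest, so treat it as a stable identifier only.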
Here's a better approach that is much faster than SHA, and you can replace the modified GetHashCode with it: C# fast hash murmur2
There are several implementations with different levels of "unmanaged" code, so if you need fully managed it's there and if you can use unsafe it's there too.
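For reference, a fully managed version of 32-bit MurmurHash2 could look roughly like the sketch below. This is my own transcription of the public-domain algorithm, not the linked implementation. The class name Murmur2Sketch, the default seed of 0, and hashing the UTF-8 bytes are assumptions.

```csharp
using System;
using System.Text;

static class Murmur2Sketch
{
    // Sketch of 32-bit MurmurHash2 over the UTF-8 bytes of a string (managed code only).
    public static uint Hash(string text, uint seed = 0)
    {
        byte[] data = Encoding.UTF8.GetBytes(text);
        const uint m = 0x5bd1e995; // multiplication constant of MurmurHash2
        const int r = 24;          // shift constant of MurmurHash2

        unchecked // the arithmetic is meant to wrap around
        {
            uint h = seed ^ (uint)data.Length;
            int i = 0;
            int remaining = data.Length;

            // Mix four bytes at a time, read as a little-endian 32-bit word.
            while (remaining >= 4)
            {
                uint k = (uint)(data[i] | data[i + 1] << 8 | data[i + 2] << 16 | data[i + 3] << 24);
                k *= m;
                k ^= k >> r;
                k *= m;
                h *= m;
                h ^= k;
                i += 4;
                remaining -= 4;
            }

            // Mix the last one to three bytes, if any.
            switch (remaining)
            {
                case 3: h ^= (uint)data[i + 2] << 16; goto case 2;
                case 2: h ^= (uint)data[i + 1] << 8; goto case 1;
                case 1: h ^= data[i]; h *= m; break;
            }

            // Final avalanche.
            h ^= h >> 13;
            h *= m;
            h ^= h >> 15;
            return h;
        }
    }
}
```

Reading the words byte by byte keeps the result independent of the machine's byte order, at some cost in speed compared to the unsafe variants.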
/// <summary>
/// Default implementation of string.GetHashCode is not consistent on different platforms (x86/x64 which is our case) and frameworks.
/// FNV-1a - (Fowler/Noll/Vo) is a fast, consistent, non-cryptographic hash algorithm with good dispersion. (see http://isthe.com/chongo/tech/comp/fnv/#FNV-1a)
/// </summary>
private static int GetFNV1aHashCode(string str)
{
    if (str == null)
        return 0;
    var length = str.Length;
    // original FNV-1a has 32 bit offset_basis = 2166136261 but length gives a bit better dispersion (2%) for our case where all the strings are equal length, for example: "3EC0FFFF01ECD9C4001B01E2A707"
    int hash = length;
    // unchecked: the multiplication is meant to wrap around, even if the project is compiled with overflow checking
    unchecked
    {
        for (int i = 0; i != length; ++i)
            hash = (hash ^ str[i]) * 16777619;
    }
    return hash;
}
I guess this implementation is slower than the unsafe one posted here, but it is much simpler and safe. It works well when maximum speed is not needed.
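For comparison, a sketch of the unmodified 32-bit FNV-1a (standard offset basis 2166136261, applied byte by byte) might look like this. The class name Fnv1a and the choice of UTF-8 bytes are my assumptions. The method above instead seeds with the length and mixes UTF-16 chars, so the two will not produce the same values.

```csharp
using System.Text;

static class Fnv1a
{
    // Standard 32-bit FNV-1a over the UTF-8 bytes of the string.
    public static uint Hash32(string text)
    {
        const uint offsetBasis = 2166136261;
        const uint prime = 16777619;

        uint hash = offsetBasis;
        unchecked // the multiplication is meant to wrap around
        {
            foreach (byte b in Encoding.UTF8.GetBytes(text))
            {
                hash ^= b;     // FNV-1a: xor first ...
                hash *= prime; // ... then multiply
            }
        }
        return hash;
    }
}
```

If an int is needed, unchecked((int)Fnv1a.Hash32(s)) reinterprets the same 32 bits.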