How do I calculate the equivalent to SQL Server (h

2019-01-12 04:57发布

问题:

In my database I have a computed column that contains a SHA1 hash of a column called URLString which holds URLs (e.g. "http://xxxx.com/index.html").

I often need to query the table to find a specific URL based on the URLString column. The table contains 100K's and these queries take several seconds (using SQL Azure). Since URLs can be quite long, I cannot create an index on this column (above 450 bytes).

To speed things up I want to calculate the equivalent of SQL Server hashbytes('SHA1',[URLString]) from C# and query based on this value.

I tried the below code, but the value I get is different than the one calculated by the database.

var urlString = Encoding.ASCII.GetBytes(url.URLString); //UTF8 also fails
var sha1 = new SHA1CryptoServiceProvider();
byte[] hash = sha1.ComputeHash(urlString);

Am I missing something trivial here?
I'm open to other ideas that can solve the same problem (as long as they are supported by SQL Azure).

Example: in the database the automatically calculated SHA1 value of URL http://www.whatismyip.org/ is 0xAE66CA69A157186A511ED462153D7CA65F0C1BF7.

回答1:

You're likely getting bitten by character encoding differences:

http://weblogs.sqlteam.com/mladenp/archive/2009/04/28/Comparing-SQL-Server-HASHBYTES-function-and-.Net-hashing.aspx

You could try getting the bytes via Encoding.ASCII.GetBytes(url) or Encoding.Unicode.GetBytes(url) and see which one your db is using.



回答2:

Below are two methods that do hashing of string and of bytes. The HashBytes method returns Base64 of the resulting bytes but you can return just the bytes if you prefer them

public static string HashString(string cleartext)
{
    byte[] clearBytes = Encoding.UTF8.GetBytes(cleartext);
    return HashBytes(clearBytes);
}  

public static string HashBytes(byte[] clearBytes)
{
    SHA1 hasher = SHA1.Create();
    byte[] hashBytes =   hasher.ComputeHash(clearBytes);
    string hash = System.Convert.ToBase64String(hashBytes);
    hasher.Clear();
    return hash;
}


回答3:

The below code is equivalent to SQL Server's hashbytes('sha1')

using (SHA1Managed sha1 = new SHA1Managed()) {
    var hash = sha1.ComputeHash(Encoding.Unicode.GetBytes(input));
    var sb = new StringBuilder(hash.Length * 2);

    foreach (byte b in hash) {
        // can be "x2" if you want lowercase
        sb.Append(b.ToString("X2"));
    }

    string output = sb.ToString();
}