Compute a hash from a stream of unknown length in C#

Posted 2019-01-09 09:09

Question:

What is the best solution in C# for computing an "on the fly" MD5-like hash of a stream of unknown length? Specifically, I want to compute a hash from data received over the network. I only know I am done receiving data when the sender terminates the connection, so I don't know the length in advance.

[EDIT] - Right now I am using MD5, but this requires a second pass over the data after it has been written to disk. I'd rather hash it in place as it comes in from the network.

Answer 1:

MD5, like other hash functions, does not require two passes.

To start:

HashAlgorithm hasher = ..;
hasher.Initialize();

As each block of data arrives:

byte[] buffer = ..;
int bytesReceived = ..;
hasher.TransformBlock(buffer, 0, bytesReceived, null, 0);

To finish and retrieve the hash:

hasher.TransformFinalBlock(new byte[0], 0, 0);
byte[] hash = hasher.Hash;

This pattern works for any type derived from HashAlgorithm, including MD5CryptoServiceProvider and SHA1Managed.

HashAlgorithm also defines a method ComputeHash which takes a Stream object; however, this method will block the thread until the stream is consumed. Using the TransformBlock approach allows an "asynchronous hash" that is computed as data arrives without using up a thread.
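The three steps above can be put together in one receive loop. The following is only a minimal sketch: the class and method names are made up for illustration, `source` stands in for whatever stream delivers the network data, and it assumes that a read returning 0 bytes means the sender has closed the connection.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Threading.Tasks;

// Hypothetical helper; the names are illustrative, not from the answer.
public static class StreamingHashSketch
{
    public static async Task<byte[]> HashAsItArrivesAsync(Stream source, HashAlgorithm hasher)
    {
        byte[] buffer = new byte[4096];
        int bytesReceived;

        // ReadAsync yields 0 at end of stream, i.e. when the sender is done.
        while ((bytesReceived = await source.ReadAsync(buffer, 0, buffer.Length)) > 0)
        {
            // Feed only the bytes that actually arrived in this chunk.
            hasher.TransformBlock(buffer, 0, bytesReceived, null, 0);
            // A real receiver would also write the chunk to disk here.
        }

        // An empty final block completes the hash computation.
        hasher.TransformFinalBlock(buffer, 0, 0);
        return hasher.Hash;
    }
}
```

Called as `await StreamingHashSketch.HashAsItArrivesAsync(networkStream, MD5.Create())`, this should return the same bytes as a one-shot ComputeHash over the complete data, without a second pass.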



Answer 2:

The System.Security.Cryptography.MD5 class contains a ComputeHash method that takes either a byte[] or Stream. Check it out at http://msdn.microsoft.com/en-us/library/system.security.cryptography.md5_members.aspx



Answer 3:

Further to @peter-mourfield's answer, here is code that uses ComputeHash():

private static string CalculateMd5(string filePathName) {
   using (var stream = File.OpenRead(filePathName))
   using (var md5 = MD5.Create()) {
      var hash = md5.ComputeHash(stream);
      var base64String = Convert.ToBase64String(hash);
      return base64String;
   }
}

Since both the stream and the MD5 instance implement IDisposable, wrap each of them in a using(...){...} statement so they are disposed when the method returns.

The method in the code example returns the same string that is used for the MD5 checksum in Azure Blob Storage.



Answer 4:

Necromancing.

Two possibilities in C# .NET Core:

private static System.Security.Cryptography.HashAlgorithm GetHashAlgorithm(System.Security.Cryptography.HashAlgorithmName hashAlgorithmName)
{
    if (hashAlgorithmName == System.Security.Cryptography.HashAlgorithmName.MD5)
        return (System.Security.Cryptography.HashAlgorithm) System.Security.Cryptography.MD5.Create();
    if (hashAlgorithmName == System.Security.Cryptography.HashAlgorithmName.SHA1)
        return (System.Security.Cryptography.HashAlgorithm) System.Security.Cryptography.SHA1.Create();
    if (hashAlgorithmName == System.Security.Cryptography.HashAlgorithmName.SHA256)
        return (System.Security.Cryptography.HashAlgorithm) System.Security.Cryptography.SHA256.Create();
    if (hashAlgorithmName == System.Security.Cryptography.HashAlgorithmName.SHA384)
        return (System.Security.Cryptography.HashAlgorithm) System.Security.Cryptography.SHA384.Create();
    if (hashAlgorithmName == System.Security.Cryptography.HashAlgorithmName.SHA512)
        return (System.Security.Cryptography.HashAlgorithm) System.Security.Cryptography.SHA512.Create();

    throw new System.Security.Cryptography.CryptographicException($"Unknown hash algorithm \"{hashAlgorithmName.Name}\".");
}


protected override byte[] HashData(System.IO.Stream data,
    System.Security.Cryptography.HashAlgorithmName hashAlgorithm)
{
    using (System.Security.Cryptography.HashAlgorithm hashAlgorithm1 =
        GetHashAlgorithm(hashAlgorithm))
    {
        return hashAlgorithm1.ComputeHash(data);
    }
}

or with BouncyCastle:

private static Org.BouncyCastle.Crypto.IDigest GetBouncyAlgorithm(
    System.Security.Cryptography.HashAlgorithmName hashAlgorithmName)
{
    if (hashAlgorithmName == System.Security.Cryptography.HashAlgorithmName.MD5)
        return new Org.BouncyCastle.Crypto.Digests.MD5Digest();
    if (hashAlgorithmName == System.Security.Cryptography.HashAlgorithmName.SHA1)
        return new Org.BouncyCastle.Crypto.Digests.Sha1Digest();
    if (hashAlgorithmName == System.Security.Cryptography.HashAlgorithmName.SHA256)
        return new Org.BouncyCastle.Crypto.Digests.Sha256Digest();
    if (hashAlgorithmName == System.Security.Cryptography.HashAlgorithmName.SHA384)
        return new Org.BouncyCastle.Crypto.Digests.Sha384Digest();
    if (hashAlgorithmName == System.Security.Cryptography.HashAlgorithmName.SHA512)
        return new Org.BouncyCastle.Crypto.Digests.Sha512Digest();

    throw new System.Security.Cryptography.CryptographicException(
        $"Unknown hash algorithm \"{hashAlgorithmName.Name}\"."
    );
} // End Function GetBouncyAlgorithm  



protected override byte[] HashData(System.IO.Stream data,
    System.Security.Cryptography.HashAlgorithmName hashAlgorithm)
{
    Org.BouncyCastle.Crypto.IDigest digest = GetBouncyAlgorithm(hashAlgorithm);

    byte[] buffer = new byte[4096];
    int cbSize;
    while ((cbSize = data.Read(buffer, 0, buffer.Length)) > 0)
        digest.BlockUpdate(buffer, 0, cbSize);

    byte[] hash = new byte[digest.GetDigestSize()];
    digest.DoFinal(hash, 0);
    return hash;
}
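Both variants above map a HashAlgorithmName to a hasher by hand. As a possible alternative that is not part of the answer, .NET Core also provides System.Security.Cryptography.IncrementalHash, which takes the HashAlgorithmName directly and accepts data chunk by chunk, much like the BouncyCastle loop. A minimal sketch, with a made-up class name and written as a plain static method rather than the override shown above:

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper; the class name is illustrative only.
public static class IncrementalHashSketch
{
    public static byte[] HashData(Stream data, HashAlgorithmName hashAlgorithm)
    {
        // CreateHash picks the implementation from the algorithm name.
        using (IncrementalHash hasher = IncrementalHash.CreateHash(hashAlgorithm))
        {
            byte[] buffer = new byte[4096];
            int cbSize;

            // Append each chunk as it is read; Read returns 0 at end of stream.
            while ((cbSize = data.Read(buffer, 0, buffer.Length)) > 0)
                hasher.AppendData(buffer, 0, cbSize);

            // Finalizes the hash and resets the instance for reuse.
            return hasher.GetHashAndReset();
        }
    }
}
```

The loop has the same shape as the BouncyCastle version, so either one can be fed from a network stream as the data arrives.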