Question:
I have an inputStream that I want to use to compute a hash and save the file to disk. I would like to know how to do that efficiently. Should I use some task to do that concurrently, should I duplicate the stream and pass it to two streams, one for the saveFile method and one for the computeHash method, or should I do something else?
Answer 1:
What about using a hash algorithm that operates at the block level? For each block in the stream, you can add the block to the hash (using TransformBlock) and then write the same block to the file.
Untested rough shot:
using System.IO;
using System.Security.Cryptography;
...
public byte[] HashedFileWrite(string filename, Stream input)
{
    using (var hash_algorithm = MD5.Create())
    // File.Create truncates an existing file; File.OpenWrite would leave old trailing bytes in place.
    using (var file = File.Create(filename))
    {
        byte[] buffer = new byte[4096];
        int read;
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Feed the block to the hash, then write the same block to disk.
            hash_algorithm.TransformBlock(buffer, 0, read, null, 0);
            file.Write(buffer, 0, read);
        }
        // Finalize with an empty block so that Hash becomes available.
        hash_algorithm.TransformFinalBlock(buffer, 0, 0);
        return hash_algorithm.Hash;
    }
}
Answer 2:
This method will copy and hash with chained streams.
private static byte[] CopyAndHash(string source, string target, Action<double> progress, Func<bool> isCanceled)
{
    using (var sha512 = SHA512.Create())
    // File.Create truncates an existing target; File.OpenWrite would not.
    using (var targetStream = File.Create(target))
    using (var cryptoStream = new CryptoStream(targetStream, sha512, CryptoStreamMode.Write))
    using (var sourceStream = File.OpenRead(source))
    {
        byte[] buffer = new byte[81920];
        int read;
        while ((read = sourceStream.Read(buffer, 0, buffer.Length)) > 0 && !isCanceled())
        {
            // The CryptoStream hashes each block and passes it through to targetStream.
            cryptoStream.Write(buffer, 0, read);
            // Percentage of the source copied so far.
            progress?.Invoke((double) sourceStream.Position / sourceStream.Length * 100);
        }
        // Finalize the hash; sha512.Hash cannot be read before this.
        cryptoStream.FlushFinalBlock();
        File.SetAttributes(target, File.GetAttributes(source));
        return sha512.Hash;
    }
}
For the full sample, see https://gist.github.com/dhcgn/da1637277d9456db9523a96a0a34da78
Answer 3:
It might not be the best option, but I would go for a Stream descendant/wrapper that acts as a pass-through to the stream actually writing the file to disk.
So:
- derive from Stream
- have one member such as Stream _inner; that will be the target stream to write to
- implement Write() and all related stuff
- in Write(), hash the blocks of data and call _inner.Write()
Usage example:
Stream s = File.OpenRead("infile.dat");
Stream output = File.Create("outfile.dat"); // "out" is a reserved keyword in C#
HashWrapStream hasher = new HashWrapStream(output);
byte[] buffer = new byte[1024];
int read;
while ((read = s.Read(buffer, 0, buffer.Length)) != 0)
{
    // Write only the bytes that were actually read, not the whole buffer.
    hasher.Write(buffer, 0, read);
}
long hash = hasher.GetComputedHash(); // get actual hash
hasher.Dispose();
s.Dispose();
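The HashWrapStream class itself is not shown in this answer. The following is a minimal sketch of what such a wrapper could look like, following the steps listed above. Everything in it is an assumption made for illustration: the choice of SHA-256, the write-only design, and GetComputedHash returning a byte[] (the usage example declares the result as long, which would need a different hash type).

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical pass-through stream: hashes every block written, then forwards it to the inner stream.
public sealed class HashWrapStream : Stream
{
    private readonly Stream _inner;
    private readonly HashAlgorithm _hash = SHA256.Create(); // algorithm choice is an assumption
    private bool _finalized;

    public HashWrapStream(Stream inner)
    {
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));
    }

    // This sketch is write-only and non-seekable.
    public override bool CanRead => false;
    public override bool CanSeek => false;
    public override bool CanWrite => true;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        if (_finalized) throw new InvalidOperationException("Hash already computed.");
        _hash.TransformBlock(buffer, offset, count, null, 0); // add the block to the running hash
        _inner.Write(buffer, offset, count);                  // then pass it through unchanged
    }

    // Finalizes the hash on first call; call this before Dispose().
    public byte[] GetComputedHash()
    {
        if (!_finalized)
        {
            _hash.TransformFinalBlock(Array.Empty<byte>(), 0, 0);
            _finalized = true;
        }
        return _hash.Hash;
    }

    public override void Flush() => _inner.Flush();
    public override int Read(byte[] buffer, int offset, int count) => throw new NotSupportedException();
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();

    protected override void Dispose(bool disposing)
    {
        if (disposing)
        {
            _hash.Dispose();
            _inner.Dispose(); // the wrapper owns the inner stream in this sketch
        }
        base.Dispose(disposing);
    }
}
```

Because the wrapper is itself a Stream, it can also be the target of input.CopyTo(hasher), which avoids writing the read loop by hand.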
Answer 4:
You'll need to stuff the stream's bytes into a byte[] in order to hash them.
Answer 5:
Here is my solution. It writes an array of structs (the ticks variable) as a CSV file (using the CsvHelper NuGet package) and then creates a hash for checksum purposes, saved next to the CSV in a file with the .sha256 suffix.
I do this by writing the CSV to a MemoryStream, then writing the MemoryStream to disk, then passing the MemoryStream to the hash algorithm.
This solution keeps the entire file in memory as a MemoryStream. That is fine for everything except multi-gigabyte files, which would run you out of RAM. If I had to do this again, I'd probably try the CryptoStream approach, but this is good enough for my foreseeable purposes.
I have verified with a third-party tool that the hashes are valid.
Here is the code:
//var ticks = **some_array_you_want_to_write_as_csv**
using (var memoryStream = new System.IO.MemoryStream())
{
    using (var textWriter = new System.IO.StreamWriter(memoryStream))
    {
        using (var csv = new CsvHelper.CsvWriter(textWriter))
        {
            csv.Configuration.DetectColumnCountChanges = true; //error checking
            csv.Configuration.RegisterClassMap<TickDataClassMap>();
            csv.WriteRecords(ticks);
            textWriter.Flush();
            //write to disk
            using (var fileStream = new System.IO.FileStream(targetFileName, System.IO.FileMode.Create))
            {
                memoryStream.Position = 0;
                memoryStream.CopyTo(fileStream);
            }
            //write sha256 hash, computed from the in-memory copy of the data
            using (var sha256 = System.Security.Cryptography.SHA256.Create())
            {
                memoryStream.Position = 0;
                var hash = sha256.ComputeHash(memoryStream);
                //ConvertByteArrayToHexString is an extension method that is not shown here
                System.IO.File.WriteAllText(targetFileName + ".sha256", hash.ConvertByteArrayToHexString());
            }
        }
    }
}
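For comparison, here is a rough sketch of the CryptoStream approach this answer mentions, which hashes the bytes on their way to disk instead of buffering the whole file in memory. It is a sketch under stated assumptions, not a drop-in replacement: HashedTextFile and WriteLinesAndHash are hypothetical names, and a plain StreamWriter writing lines stands in for the CsvHelper writer.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

public static class HashedTextFile
{
    // Hypothetical helper: writes the lines to path and returns the SHA-256 of the bytes written.
    public static byte[] WriteLinesAndHash(string path, string[] lines)
    {
        using (var sha256 = SHA256.Create())
        using (var fileStream = new FileStream(path, FileMode.Create))
        using (var cryptoStream = new CryptoStream(fileStream, sha256, CryptoStreamMode.Write))
        using (var writer = new StreamWriter(cryptoStream, new UTF8Encoding(false))) // UTF-8 without BOM
        {
            foreach (var line in lines)
            {
                writer.Write(line);
                writer.Write('\n');
            }
            writer.Flush();                 // push buffered text into the CryptoStream
            cryptoStream.FlushFinalBlock(); // finalize the hash; the data has passed through to the file
            return sha256.Hash;
        }
    }
}
```

The text writer sits on top of the CryptoStream, so every byte it emits is hashed and forwarded to the FileStream in one pass, and memory use stays bounded by the writer's buffer.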