Which .NET library has the fastest decompress performance (in terms of throughput)?
There are quite a few libraries out there...
- GZipStream
- DotNetZip
- Xceed Zip for .NET
- SevenZipLib
- SharpZipLib
...and I expect there are more I haven't listed.
Has anyone seen a benchmark of the throughput performance of these GZIP libraries? I'm interested in decompression throughput, but I'd like to see the results for compression too.
I've had good performance with SevenZipLib for very large files, but I was using the native 7zip format and highly compressible content. If your content won't compress well, your throughput will differ greatly from some of the benchmarks you can find for these libraries.
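To see how much content compressibility dominates throughput, here is a minimal sketch (the buffer sizes and the class name are arbitrary choices for illustration, not from any benchmark above) that gzips a highly repetitive buffer and a random one with the framework's GZipStream and reports time and size for each:

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.IO.Compression;

class CompressibilityDemo
{
    // Compress a buffer with GZipStream; return (elapsed ms, compressed size).
    public static Tuple<long, long> GzipStats(byte[] data)
    {
        var output = new MemoryStream();
        var sw = Stopwatch.StartNew();
        using (var gz = new GZipStream(output, CompressionMode.Compress, true))
        {
            gz.Write(data, 0, data.Length);
        }
        sw.Stop();
        return Tuple.Create(sw.ElapsedMilliseconds, output.Length);
    }

    static void Main()
    {
        var repetitive = new byte[10000000];   // all zeros: best-case ratio
        var random = new byte[10000000];
        new Random(42).NextBytes(random);      // noise: worst-case ratio

        var r1 = GzipStats(repetitive);
        var r2 = GzipStats(random);
        Console.WriteLine("Repetitive: {0} ms, {1:#,0} bytes", r1.Item1, r1.Item2);
        Console.WriteLine("Random:     {0} ms, {1:#,0} bytes", r2.Item1, r2.Item2);
    }
}
```

The random buffer will typically take longer to compress and come out barely smaller than the input, which is why benchmark numbers on one corpus don't transfer to another.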
I found problems with Microsoft's GZipStream implementation not being able to read certain gzip files, so I have been testing a few libraries.
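One pragmatic workaround for that (a sketch, separate from the benchmark below; the helper name is made up, and the assumption is that the framework signals a malformed stream with InvalidDataException): try the framework's GZipStream first and fall back to SharpZipLib when it rejects the file.

```csharp
using System.IO;
using System.IO.Compression;
using ICSharpCode.SharpZipLib.GZip;

static class GzipFallback
{
    public static byte[] Decompress(byte[] gzipBytes)
    {
        try
        {
            using (var src = new MemoryStream(gzipBytes))
            using (var gz = new GZipStream(src, CompressionMode.Decompress))
            using (var result = new MemoryStream())
            {
                gz.CopyTo(result);
                return result.ToArray();
            }
        }
        catch (InvalidDataException)
        {
            // The framework rejected the stream; retry with SharpZipLib,
            // which accepts some gzip files GZipStream chokes on.
            using (var src = new MemoryStream(gzipBytes))
            using (var gz = new GZipInputStream(src))
            using (var result = new MemoryStream())
            {
                gz.CopyTo(result);
                return result.ToArray();
            }
        }
    }
}
```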
This is a basic test I adapted for you to run, tweak, and decide:
using System;
using System.Diagnostics;
using System.IO;
using System.IO.Compression;
using NUnit.Framework;
using Ionic.Zlib;
using ICSharpCode.SharpZipLib.GZip;

namespace ZipTests
{
    [TestFixture]
    public class ZipTests
    {
        MemoryStream input, compressed, decompressed;
        Stream compressor;
        int inputSize;
        Stopwatch timer;

        public ZipTests()
        {
            string testFile = "TestFile.pdf";
            using (var file = File.OpenRead(testFile))
            {
                inputSize = (int)file.Length;
                Console.WriteLine("Reading " + inputSize + " from " + testFile);
                var ms = new MemoryStream(inputSize);
                ms.SetLength(inputSize); // size the stream so its Length matches the buffer
                file.Read(ms.GetBuffer(), 0, inputSize);
                ms.Position = 0;
                input = ms;
            }
            compressed = new MemoryStream();
        }

        void StartCompression()
        {
            Console.WriteLine("Using " + compressor.GetType() + ":");
            GC.Collect(2, GCCollectionMode.Forced); // start fresh
            timer = Stopwatch.StartNew();
        }

        public void EndCompression()
        {
            timer.Stop();
            Console.WriteLine(" took " + timer.Elapsed
                + " to compress " + inputSize.ToString("#,0") + " bytes into "
                + compressed.Length.ToString("#,0"));
            decompressed = new MemoryStream(inputSize);
            compressed.Position = 0; // rewind!
            timer.Restart();
        }

        public void AfterDecompression()
        {
            timer.Stop();
            Console.WriteLine(" then " + timer.Elapsed + " to decompress.");
            Assert.AreEqual(inputSize, decompressed.Length);
            // ToArray() trims to Length; comparing GetBuffer() could fail on
            // unequal capacities even when the contents match.
            Assert.AreEqual(input.ToArray(), decompressed.ToArray());
            input.Dispose();
            compressed.Dispose();
            decompressed.Dispose();
        }

        [Test]
        public void TestGZipStream()
        {
            compressor = new System.IO.Compression.GZipStream(compressed,
                System.IO.Compression.CompressionMode.Compress, true);
            StartCompression();
            compressor.Write(input.GetBuffer(), 0, inputSize);
            compressor.Close();
            EndCompression();
            var decompressor = new System.IO.Compression.GZipStream(compressed,
                System.IO.Compression.CompressionMode.Decompress, true);
            decompressor.CopyTo(decompressed);
            AfterDecompression();
        }

        [Test]
        public void TestDotNetZip()
        {
            compressor = new Ionic.Zlib.GZipStream(compressed,
                Ionic.Zlib.CompressionMode.Compress, true);
            StartCompression();
            compressor.Write(input.GetBuffer(), 0, inputSize);
            compressor.Close();
            EndCompression();
            var decompressor = new Ionic.Zlib.GZipStream(compressed,
                Ionic.Zlib.CompressionMode.Decompress, true);
            decompressor.CopyTo(decompressed);
            AfterDecompression();
        }

        [Test]
        public void TestSharpZlib()
        {
            compressor = new ICSharpCode.SharpZipLib.GZip.GZipOutputStream(compressed)
                { IsStreamOwner = false };
            StartCompression();
            compressor.Write(input.GetBuffer(), 0, inputSize);
            compressor.Close();
            EndCompression();
            var decompressor = new ICSharpCode.SharpZipLib.GZip.GZipInputStream(compressed);
            decompressor.CopyTo(decompressed);
            AfterDecompression();
        }

        static void Main()
        {
            Console.WriteLine("Running CLR version " + Environment.Version +
                " on " + Environment.OSVersion);
            Assert.AreEqual(1, 1); // preload NUnit
            new ZipTests().TestGZipStream();
            new ZipTests().TestDotNetZip();
            new ZipTests().TestSharpZlib();
        }
    }
}
And the result on the system I am currently running (Mono on Linux) is as follows:
Running Mono CLR version 4.0.30319.1 on Unix 3.2.0.29
Reading 37711561 from /home/agustin/Incoming/ZipTests/TestFile.pdf
Using System.IO.Compression.GZipStream:
took 00:00:03.3058572 to compress 37,711,561 bytes into 33,438,894
then 00:00:00.5331546 to decompress.
Reading 37711561 from /home/agustin/Incoming/ZipTests/TestFile.pdf
Using Ionic.Zlib.GZipStream:
took 00:00:08.9531478 to compress 37,711,561 bytes into 33,437,891
then 00:00:01.8047543 to decompress.
Reading 37711561 from /home/agustin/Incoming/ZipTests/TestFile.pdf
Using ICSharpCode.SharpZipLib.GZip.GZipOutputStream:
took 00:00:07.4982231 to compress 37,711,561 bytes into 33,431,962
then 00:00:02.4157496 to decompress.
Be warned that this is Mono's GZIP implementation; Microsoft's version will give its own results (and, as I mentioned, it can't handle every gzip file you give it).
This is what I got on a Windows system:
Running CLR version 4.0.30319.1 on Microsoft Windows NT 5.1.2600 Service Pack 3
Reading 37711561 from TestFile.pdf
Using System.IO.Compression.GZipStream:
took 00:00:03.3557061 to compress 37.711.561 bytes into 36.228.969
then 00:00:00.7079438 to decompress.
Reading 37711561 from TestFile.pdf
Using Ionic.Zlib.GZipStream:
took 00:00:23.4180958 to compress 37.711.561 bytes into 33.437.891
then 00:00:03.5955664 to decompress.
Reading 37711561 from TestFile.pdf
Using ICSharpCode.SharpZipLib.GZip.GZipOutputStream:
took 00:00:09.9157130 to compress 37.711.561 bytes into 33.431.962
then 00:00:03.0983499 to decompress.
It is easy enough to add more tests...
Compression benchmarks vary with the size of the streams being compressed and their precise content. If this is a particularly important performance bottleneck for you, it's worth your time to write a sample app using each library and to run tests with your real files.
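As a starting point for that, a minimal harness along those lines might look like this (a sketch: the directory pattern and run count are placeholders, and only the framework's GZipStream is shown; swap in the other libraries' streams the same way as in the test fixture above):

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.IO.Compression;

class DecompressBenchmark
{
    static void Main()
    {
        const int runs = 5; // average over several runs to smooth out noise
        foreach (var path in Directory.GetFiles(".", "*.gz"))
        {
            byte[] data = File.ReadAllBytes(path); // keep disk I/O out of the timing
            long outputSize = 0;
            var sw = Stopwatch.StartNew();
            for (int i = 0; i < runs; i++)
            {
                using (var gz = new GZipStream(new MemoryStream(data),
                                               CompressionMode.Decompress))
                using (var outMs = new MemoryStream())
                {
                    gz.CopyTo(outMs);
                    outputSize = outMs.Length;
                }
            }
            sw.Stop();
            Console.WriteLine("{0}: {1:#,0} bytes decompressed, {2} ms/run",
                Path.GetFileName(path), outputSize, sw.ElapsedMilliseconds / runs);
        }
    }
}
```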