I have a 453MB XML file which I'm trying to compress to a ZIP using SharpZipLib.
Below is the code I'm using to create the zip, but it throws an OutOfMemoryException. The same code successfully compresses a 428MB file.
Any idea why the exception is happening? My system has plenty of memory available.
    public void CompressFiles(List<string> pathnames, string zipPathname)
    {
        try
        {
            using (FileStream stream = new FileStream(zipPathname, FileMode.Create, FileAccess.Write, FileShare.None))
            {
                using (ZipOutputStream stream2 = new ZipOutputStream(stream))
                {
                    foreach (string str in pathnames)
                    {
                        FileStream stream3 = new FileStream(str, FileMode.Open, FileAccess.Read, FileShare.Read);
                        byte[] buffer = new byte[stream3.Length];
                        try
                        {
                            if (stream3.Read(buffer, 0, buffer.Length) != buffer.Length)
                            {
                                throw new Exception(string.Format("Error reading '{0}'.", str));
                            }
                        }
                        finally
                        {
                            stream3.Close();
                        }
                        ZipEntry entry = new ZipEntry(Path.GetFileName(str));
                        stream2.PutNextEntry(entry);
                        stream2.Write(buffer, 0, buffer.Length);
                    }
                    stream2.Finish();
                }
            }
        }
        catch (Exception)
        {
            File.Delete(zipPathname);
            throw;
        }
    }
You're trying to create a buffer as big as the file. Instead, make the buffer a fixed size, read some bytes into it, and write only the bytes that were actually read to the zip file.
Here's your code with a buffer of 4096 bytes (and some cleanup):
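A sketch of such a version, assuming SharpZipLib's `ZipOutputStream` and `ZipEntry` from the `ICSharpCode.SharpZipLib.Zip` namespace as used in the question. The class name `ZipHelper` and the variable names are placeholders chosen for illustration:

```csharp
using System.Collections.Generic;
using System.IO;
using ICSharpCode.SharpZipLib.Zip;

public static class ZipHelper
{
    public static void CompressFiles(List<string> pathnames, string zipPathname)
    {
        try
        {
            using (var zipFile = new FileStream(zipPathname, FileMode.Create, FileAccess.Write, FileShare.None))
            using (var zipStream = new ZipOutputStream(zipFile))
            {
                // One small buffer, reused for every file.
                var buffer = new byte[4096];

                foreach (string pathname in pathnames)
                {
                    using (var input = new FileStream(pathname, FileMode.Open, FileAccess.Read, FileShare.Read))
                    {
                        zipStream.PutNextEntry(new ZipEntry(Path.GetFileName(pathname)));

                        // Copy the file chunk by chunk; Read returns 0 at end of stream.
                        int read;
                        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
                        {
                            zipStream.Write(buffer, 0, read);
                        }
                    }
                }
                zipStream.Finish();
            }
        }
        catch
        {
            // Remove the partial zip, then let the caller see the error.
            File.Delete(zipPathname);
            throw;
        }
    }
}
```

Memory use stays at roughly the size of the buffer, no matter how large the input files are.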
Especially note this block:
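The read/write loop, pulled out here into a hypothetical helper `CopyInChunks` so that it stands on its own. In the zip code, `output` would be the `ZipOutputStream`; the class and method names are assumptions for illustration only:

```csharp
using System.IO;

public static class ChunkedCopy
{
    public static void CopyInChunks(Stream input, Stream output)
    {
        var buffer = new byte[4096];
        int read;

        // Read returns the number of bytes placed in the buffer, or 0 at end of stream.
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Write only the bytes that were actually read.
            output.Write(buffer, 0, read);
        }
    }
}
```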
This reads bytes into buffer. When there are no more bytes, the Read() method returns 0, so that's when we stop. When Read() succeeds, we can be sure there is some data in the buffer, but we don't know how many bytes. The whole buffer might be filled, or just a small portion of it. Therefore, we use the number of bytes read (the variable read) to determine how many bytes to write to the ZipOutputStream.

That block of code, by the way, can be replaced by a single statement that was added in .NET 4.0, which does the same thing:
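The statement is presumably `Stream.CopyTo`, which was introduced in .NET 4.0. A minimal sketch, with the wrapper class `CopyToExample` and method `CopyAll` being hypothetical names:

```csharp
using System.IO;

public static class CopyToExample
{
    public static void CopyAll(Stream input, Stream output)
    {
        // Reads from input until the end and writes everything to output,
        // using an internal fixed-size buffer.
        input.CopyTo(output);
    }
}
```

In the zip code, `output` would again be the `ZipOutputStream`, called after `PutNextEntry`.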
So, your code could become:
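One way this could look, again assuming SharpZipLib's `ZipOutputStream` and `ZipEntry` and using a placeholder class name `ZipHelperCopyTo`:

```csharp
using System.Collections.Generic;
using System.IO;
using ICSharpCode.SharpZipLib.Zip;

public static class ZipHelperCopyTo
{
    public static void CompressFiles(List<string> pathnames, string zipPathname)
    {
        try
        {
            using (var zipFile = new FileStream(zipPathname, FileMode.Create, FileAccess.Write, FileShare.None))
            using (var zipStream = new ZipOutputStream(zipFile))
            {
                foreach (string pathname in pathnames)
                {
                    using (var input = new FileStream(pathname, FileMode.Open, FileAccess.Read, FileShare.Read))
                    {
                        zipStream.PutNextEntry(new ZipEntry(Path.GetFileName(pathname)));
                        // CopyTo handles the buffering and the read loop internally.
                        input.CopyTo(zipStream);
                    }
                }
                zipStream.Finish();
            }
        }
        catch
        {
            // Remove the partial zip, then rethrow.
            File.Delete(zipPathname);
            throw;
        }
    }
}
```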
And now you know why you got the error, and how to use buffers.
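You're allocating a lot of memory for no good reason, and I bet you have a 32-bit process. 32-bit processes can only allocate up to 2GB of virtual memory in normal conditions, and the library surely allocates memory too. A 453MB array also has to fit in a single contiguous block of that address space, which fragmentation can make impossible even when plenty of memory is free.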
Anyway, several things are wrong here:

    byte[] buffer = new byte[stream3.Length];

Why? You don't need to store the whole thing in memory to process it.

    if (stream3.Read(buffer, 0, buffer.Length) != buffer.Length)

This one is nasty. Stream.Read is explicitly allowed to return fewer bytes than you asked for, and this is still a valid result. When reading a stream into a buffer, you have to call Read repeatedly until the buffer is filled or the end of the stream is reached.

Your variables should have more meaningful names. You can easily get lost with these stream2, stream3, etc.

A simple solution would be:
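A sketch along those lines, assuming SharpZipLib's `ZipOutputStream`, `ZipEntry`, and `CloseEntry`. The class `ZipWriter`, the helper `AddFile`, and all variable names are hypothetical, and the cleanup of a partial zip on failure is left out for brevity:

```csharp
using System.Collections.Generic;
using System.IO;
using ICSharpCode.SharpZipLib.Zip;

public static class ZipWriter
{
    // Adds one file to an already-open zip stream without loading it into memory.
    public static void AddFile(ZipOutputStream zipStream, string sourcePath)
    {
        using (var sourceStream = File.OpenRead(sourcePath))
        {
            zipStream.PutNextEntry(new ZipEntry(Path.GetFileName(sourcePath)));
            sourceStream.CopyTo(zipStream);
            zipStream.CloseEntry();
        }
    }

    public static void CompressFiles(IEnumerable<string> sourcePaths, string zipPath)
    {
        using (var zipFileStream = File.Create(zipPath))
        using (var zipStream = new ZipOutputStream(zipFileStream))
        {
            foreach (string sourcePath in sourcePaths)
            {
                AddFile(zipStream, sourcePath);
            }
        }
    }
}
```

Each stream is named for what it holds, and no step ever needs more than a small copy buffer.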