I have to merge multiple one-page PDFs into a single PDF. I'm using iTextSharp 5.5.5.0 to accomplish this, but when I merge more than 900-1000 PDFs I get an out of memory exception. I noticed that even if I free my reader and close it, the memory never gets cleaned up properly (the amount of memory used by the process never decreases), so I was wondering what I could possibly be doing wrong. This is my code:
    using (MemoryStream msOutput = new MemoryStream())
    {
        Document doc = new Document();
        PdfSmartCopy pCopy = new PdfSmartCopy(doc, msOutput);
        doc.Open();
        foreach (Tuple<string, int> file in filesList)
        {
            PdfReader pdfFile = new PdfReader(file.Item1);
            for (int j = 0; j < file.Item2; j++)
                for (int i = 1; i < pdfFile.NumberOfPages + 1; i++) // in this case it's always 1
                    pCopy.AddPage(pCopy.GetImportedPage(pdfFile, i));
            pCopy.FreeReader(pdfFile);
            pdfFile.Close();
            File.Delete(file.Item1);
        }
        pCopy.Close();
        doc.Close();
        byte[] content = msOutput.ToArray();
        using (FileStream fs = File.Create(Out))
        {
            fs.Write(content, 0, content.Length);
        }
    }
It never gets to writing the file; I get the out of memory exception during the pCopy.AddPage() part. I even tried flushing the pCopy variable, but that didn't change anything. I looked through the iText documentation and various questions on StackOverflow, and it seems to me that I'm following every suggestion for keeping memory usage low, but it isn't working. Any ideas on this?
Since this is a large amount of data, I'd recommend writing directly to a FileStream instead of a MemoryStream. This might be an actual case where an Out of Memory Exception literally means "Out of Memory".
Also, as Bruno pointed out, the "smart" part of PdfSmartCopy unfortunately comes at the cost of memory, too. Switching to PdfCopy should reduce memory pressure, although your final PDF might be larger.
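Putting both suggestions together, something along these lines might work. It is only a rough sketch, assuming the iTextSharp 5.x API used in the question; the PdfMerger class and Merge method names are made up for illustration, and I haven't measured the memory use on your files.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

// Hypothetical helper, not part of iTextSharp.
static class PdfMerger
{
    // filesList holds (path, number of copies) pairs, as in the question.
    public static void Merge(IEnumerable<Tuple<string, int>> filesList, string outputPath)
    {
        // Write straight to disk so the merged PDF is never held in a MemoryStream.
        using (FileStream fs = File.Create(outputPath))
        {
            Document doc = new Document();
            // PdfCopy instead of PdfSmartCopy: no duplicate detection, less bookkeeping in memory.
            PdfCopy copy = new PdfCopy(doc, fs);
            doc.Open();

            foreach (Tuple<string, int> file in filesList)
            {
                PdfReader reader = new PdfReader(file.Item1);
                try
                {
                    for (int j = 0; j < file.Item2; j++)
                        for (int i = 1; i <= reader.NumberOfPages; i++)
                            copy.AddPage(copy.GetImportedPage(reader, i));

                    // Let the writer flush and release what it holds for this reader.
                    copy.FreeReader(reader);
                }
                finally
                {
                    reader.Close();
                }
            }

            // Closing the document finishes the PDF and closes the writer.
            doc.Close();
        }
    }
}
```

You would call it as PdfMerger.Merge(filesList, Out). I left out the File.Delete call from your loop; you can put it back after reader.Close() if you still want the source files removed. Keep in mind that without the "smart" deduplication, resources shared between the input files (fonts, images) may be written once per page, which is where the larger output comes from.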