I need a fast PDF compression library for .NET that will let me run 10 concurrent threads, each compressing a separate PDF file to around 10% of its original size. Any suggestions? (I have already tried the product from neeviaPDF.com. It is not as fast as I need.)
Question:
Answer 1:
The company's website shows three examples. One compresses a PDF from 9.1 MB to 133 KB. Opening the files in Notepad shows that the original holds a single 2500x3000, mostly black image compressed with FlateDecode, and the output holds a same-size image compressed with JPEG2000. That compression ratio is probably the best-case scenario. The other two examples are more reasonable: 741 KB to 349 KB and 940 KB to 804 KB. They also include a screenshot of the settings; one option, checked in all three examples, carries the warning "VERY SLOW!!!" It seems like a good product, though. It does all the right things, including web optimization.
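To put those three examples side by side, here is a quick back-of-the-envelope calculation of each output as a percentage of its input. The sizes are the ones quoted above; treating 9.1 MB as 9.1 * 1024 KB and the labels for the second and third examples are my own assumptions.

```python
# Sizes quoted in the three examples, as (before_kb, after_kb).
# Assumption: 9.1 MB is converted as 9.1 * 1024 KB.
examples = {
    "mostly black image": (9.1 * 1024, 133),
    "second example": (741, 349),
    "third example": (940, 804),
}


def percent_of_original(before_kb, after_kb):
    """Return the output size as a percentage of the input size."""
    return 100.0 * after_kb / before_kb


ratios = {name: percent_of_original(b, a) for name, (b, a) in examples.items()}

for name, pct in ratios.items():
    print(f"{name}: {pct:.1f}% of original")
```

Only the mostly black image gets under the 10% target; the other two land at roughly half and most of the original size.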
Reaching 10% of the original size is unlikely unless your PDFs' contents are known ahead of time and heavy in images, and you hand-code a solution using iTextSharp that takes advantage of the way the PDFs are put together.
If you like the way your current component works but it is not thread-safe, why not just run 10 separate processes with it? If you've got a lot of large images, watch out for out-of-memory errors.
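A minimal sketch of the separate-processes idea, written in Python only to keep it short; the same shape works from .NET by starting child processes. The executable name `pdfcompress.exe` and its argument order are hypothetical, so swap in whatever command line your component actually exposes.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 10  # at most ten compressor processes alive at once


def build_command(src, dst):
    # Hypothetical CLI: replace with the real executable and flags.
    return ["pdfcompress.exe", src, dst]


def compress_one(job, make_command=build_command):
    """Run one external compressor process and return (source, exit code)."""
    src, dst = job
    result = subprocess.run(make_command(src, dst), capture_output=True)
    return src, result.returncode


def compress_all(jobs, make_command=build_command, max_workers=MAX_WORKERS):
    """Compress every (src, dst) pair, keeping max_workers processes busy."""
    # The threads only wait on their child process; the actual work
    # happens in the children, so thread safety of the component
    # never comes into play.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda job: compress_one(job, make_command), jobs))


# Example call (not executed here):
# compress_all([("a.pdf", "a.small.pdf"), ("b.pdf", "b.small.pdf")])
```

Each child has its own address space, so lowering `max_workers` is the simplest way to cap peak memory when the files contain many large images.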
Answer 2:
Try Apago's PDFshrink. It's a commercial product and supports PDF compression using multi-core CPUs.
Answer 3:
Morovia's PDFLeo can compress PDFs to a small size. It employs two major techniques: data stream compression and object streams. According to its manual:
- Removing unused objects. Unused objects will be discarded. If a PDF is produced through incremental update, many objects are not needed. Incremental update is a feature that allows a processing application to append changes at the end of the file without removing prior object definitions. This technique reduces memory usage at the cost of a bigger file size.
- Writing objects in a compact syntax. PDFLeo writes output using compact syntax. Extra white space is removed. Hexadecimal strings are written with more compact binary representations.
- Compressed streams. When specified, pdfleo compresses all streams except those that must be kept intact.
- Object streams. Non-stream objects can be placed into a special object stream and compressed.
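To get a feel for why the compact-syntax and object-stream items pay off, here is a rough illustration using Python's `zlib`, which implements the same deflate compression that FlateDecode uses. This is not PDFLeo's code: the sample dictionaries are made up, and a real object stream also carries an offset table that is left out here.

```python
import zlib

# Compact syntax: a hexadecimal string spends two characters per byte
# of data, plus the angle-bracket delimiters.
raw = bytes(range(32))
hex_form = b"<" + raw.hex().encode("ascii") + b">"

# Object streams: non-stream objects are normally stored as plain text.
# Packed into one stream, they can be deflated together, and the
# compressor can reuse the keys that repeat from object to object.
# These 200 dictionaries are invented sample data.
objects = [
    f"<< /Type /Annot /Subtype /Link /Rect [0 {i} 100 {i + 10}] >>".encode("ascii")
    for i in range(200)
]
plain_size = sum(len(obj) for obj in objects)
packed_size = len(zlib.compress(b"\n".join(objects)))

print(f"hex string: {len(hex_form)} bytes for {len(raw)} bytes of data")
print(f"objects: {plain_size} bytes plain, {packed_size} bytes packed")
```

The gain from object streams depends on how repetitive the objects are, so files with many similar small objects benefit the most.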