I want to convert some multi-pages .tif or .pdf files to individual .png images. From command line (using ImageMagick) I just do:
convert multi_page.pdf file_out.png
And I get all the pages as individual images (file_out-0.png, file_out-1.png, ...)
I would like to handle this file conversion within Python, unfortunately PIL cannot read .pdf files, so I want to use PythonMagick. I tried:
import PythonMagick
im = PythonMagick.Image('multi_page.pdf')
im.write("file_out%d.png")
or just
im.write("file_out.png")
But I only get 1 page converted to png.
Of course I could load each pages individually and convert them one by one. But there must be a way to do them all at once?
ImageMagick is not memory efficient, so if you try to read a large pdf, like 100 pages or so, the memory requirement will be huge and it might crash or seriously slow down your system. So after all reading all pages at once with PythonMagick is a bad idea, its not safe.
So for pdfs, I ended up doing it page by page, but for that I need to get the number of pages first using pyPdf, its reasonably fast:
pdf_im = pyPdf.PdfFileReader(file('multi_page.pdf', "rb"))
npage = pdf_im.getNumPages()
for p in npage:
im = PythonMagick.Image('multi_page.pdf['+ str(p) +']')
im.write('file_out-' + str(p)+ '.png')
A more complete example based on the answer by Ivo Flipse and http://p-s.co.nz/wordpress/pdf-to-png-using-pythonmagick/
This uses a higher resolution and uses PyPDF2 instead of older pyPDF.
import sys
import PyPDF2
import PythonMagick
pdffilename = sys.argv[1]
pdf_im = PyPDF2.PdfFileReader(file(pdffilename, "rb"))
npage = pdf_im.getNumPages()
print('Converting %d pages.' % npage)
for p in range(npage):
im = PythonMagick.Image()
im.density('300')
im.read(pdffilename + '[' + str(p) +']')
im.write('file_out-' + str(p)+ '.png')
I had the same problem and as a work around i used ImageMagick and did
import subprocess
params = ['convert', 'src.pdf', 'out.png']
subprocess.check_call(params)