Question:
What's the best way to split a multi-page TIFF with Python? PIL doesn't seem to support multi-page images, and I haven't found an exact port of libtiff for Python. Would PyLibTiff be the way to go? Can somebody provide a simple example of how I could parse multiple pages within a TIFF?
Answer 1:
I use ImageMagick as an external program to convert multi-page faxes into viewable PNGs:
/usr/bin/convert /var/voip/fax/out/2012/04/fax_out_L1_17.tiff[0] -scale 50x100% -depth 16 /tmp/fax_images/fax_out_L1_17-0-m.png
This converts the first page to PNG.
aaa.tiff[1] would be the second page, and so on.
Or to extract all images, do:
convert -verbose fax_in_L1-1333564876.469.tiff a.png
fax_in_L1-1333564876.469.tiff[0] TIFF 1728x1078 1728x1078+0+0 1-bit Bilevel DirectClass 109KiB 0.030u 0:00.030
fax_in_L1-1333564876.469.tiff[1] TIFF 1728x1078 1728x1078+0+0 1-bit Bilevel DirectClass 109KiB 0.020u 0:00.010
fax_in_L1-1333564876.469.tiff[2] TIFF 1728x1078 1728x1078+0+0 1-bit Bilevel DirectClass 109KiB 0.020u 0:00.010
fax_in_L1-1333564876.469.tiff=>a-0.png[0] TIFF 1728x1078 1728x1078+0+0 1-bit Bilevel DirectClass 12KiB 0.030u 0:00.019
fax_in_L1-1333564876.469.tiff=>a-1.png[1] TIFF 1728x1078 1728x1078+0+0 1-bit Bilevel DirectClass 8KiB 0.040u 0:00.039
fax_in_L1-1333564876.469.tiff=>a-2.png[2] TIFF 1728x1078 1728x1078+0+0 1-bit Bilevel DirectClass 32KiB 0.070u 0:00.070
So, to split one multi-page TIFF into single-page TIFFs, you would execute:
convert in-12345.tiff /tmp/out-12345-%d.tiff
and then work with the temporary files: /tmp/out-12345-*.tiff
(The %d in the output name makes ImageMagick write one file per page; without it, a TIFF output keeps all pages in a single file.)
However, ImageMagick can do a lot of processing, so you can probably achieve your desired result in one command.
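Since the question asks about Python, the commands above can also be driven from a script. The following is a minimal sketch, assuming ImageMagick's convert binary is on the PATH (newer ImageMagick releases name it magick); the helper names page_command and extract_page are made up for illustration.

```python
import subprocess


def page_command(src, page, dst, convert="convert"):
    """Build the argument list that extracts one page (0-based) of src."""
    # "file.tiff[0]" selects the first page, "file.tiff[1]" the second, ...
    return [convert, "%s[%d]" % (src, page), dst]


def extract_page(src, page, dst, convert="convert"):
    """Run ImageMagick; raises CalledProcessError if the command fails."""
    subprocess.run(page_command(src, page, dst, convert), check=True)
```

Calling extract_page("fax.tiff", 0, "/tmp/fax-0.png") would then correspond to the first command shown above, minus the scaling options. Passing the arguments as a list avoids shell quoting problems with the square brackets.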
Answer 2:
A project that makes this easy is PIMS (disclosure: I am one of the main authors, and this question was one of the things that prompted me to work on it). The core of PIMS is essentially a cleaned-up and generalized version of the following class.
A class to do basic frame extraction + simple iteration.
import numpy as np
import PIL.Image


class Stack_wrapper(object):
    def __init__(self, fname):
        '''fname is the full path'''
        self.im = PIL.Image.open(fname)
        self.im.seek(0)
        # get image dimensions from the metadata; the order is flipped
        # due to row-major vs. column-major ordering in TIFFs and numpy
        self.im_sz = [self.im.tag[0x101][0],
                      self.im.tag[0x100][0]]
        self.cur = self.im.tell()

    def get_frame(self, j):
        '''Extracts the jth frame from the image sequence.
        If the frame does not exist, return None.'''
        try:
            self.im.seek(j)
        except EOFError:
            return None
        self.cur = self.im.tell()
        return np.reshape(self.im.getdata(), self.im_sz)

    def __iter__(self):
        self.im.seek(0)
        self.old = self.cur
        self.cur = self.im.tell()
        return self

    def next(self):
        try:
            self.im.seek(self.cur)
            self.cur = self.im.tell() + 1
        except EOFError:
            # restore the frame that was active before iteration started
            self.im.seek(self.old)
            self.cur = self.im.tell()
            raise StopIteration
        return np.reshape(self.im.getdata(), self.im_sz)

    # Python 3 looks for __next__ when iterating
    __next__ = next
Answer 3:
ImageMagick worked really well for me. When splitting a TIFF file, which is basically converting from TIFF to TIFF, you can use a format specifier in the output filename to force the output to be saved as individual TIFF files. To do that, try
convert input.tif output-%d.tif
The %d operator is a C printf-style %d. So, if you need a three-character-wide running sequence, you can say
convert input.tif output-%3d.tif
and so on. %d is replaced by the "scene" number of the image. Scene numbers may not always start at 0 (or 1, if you want it that way). To set up the sequence the way you want, try
convert input.tif -scene 1 output-%3d.tif
This would start the sequence right from the count you provided.
convert -scene 1 input.TIF output-%d.TIF
output-1.TIF
output-2.TIF
output-3.TIF
Magick indeed!! :)
The ImageMagick documentation has more details. This works on my Windows machine too.
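To preview the file names a pattern will produce, Python's % operator can stand in, since it follows the same C printf conventions. This only illustrates the formatting rules and does not call ImageMagick; it assumes ImageMagick expands the pattern exactly as printf would, and preview_names is a made-up helper.

```python
def preview_names(pattern, first_scene, count):
    """Expand a printf-style pattern for `count` consecutive scene numbers."""
    return [pattern % n for n in range(first_scene, first_scene + count)]


# Scene numbers starting at 1, as with "-scene 1"
print(preview_names("output-%d.tif", 1, 3))
# A zero-padded, three-digit field
print(preview_names("output-%03d.tif", 1, 2))
```

Under printf rules a bare width such as %3d pads with spaces, while %03d pads with zeros, so the latter is usually the safer choice for file names that should sort correctly.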
Answer 4:
The following splits a tif file with multiple frames into tif files where each file is one frame.
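from PIL import Image

def parse_tif(filePath):
    img = Image.open(filePath)
    # n_frames is the number of frames (pages) Pillow found in the file
    for i in range(img.n_frames):
        try:
            img.seek(i)
            img.save('Block_%s.tif' % (i,))
        except EOFError:  # end of file error: no more frames
            break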
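Pillow also ships an ImageSequence helper that walks the frames without a manual seek loop. The following is a minimal sketch, assuming a reasonably recent Pillow; the function name split_tiff and the page_%d.tif naming are illustrative choices, not part of the answer above.

```python
import os

from PIL import Image, ImageSequence


def split_tiff(path, out_dir):
    """Write each page of a multi-page TIFF to its own file; return the paths."""
    written = []
    with Image.open(path) as img:
        # ImageSequence.Iterator seeks through the frames one at a time.
        for i, frame in enumerate(ImageSequence.Iterator(img)):
            out_path = os.path.join(out_dir, "page_%d.tif" % i)
            # save() without save_all=True writes only the current frame.
            frame.save(out_path)
            written.append(out_path)
    return written
```

Returning the list of written paths makes it easy to hand the pages to a later processing step.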
Answer 5:
You could convert it to PDF and use pyPDF to split the pages.