Split multi-page tiff with python

Posted 2019-01-24 13:48

Question:

What's the best way to split a multi-page TIFF with Python? PIL doesn't seem to support multi-page images, and I haven't found an exact port of libtiff for Python. Would PyLibTiff be the way to go? Can somebody provide a simple example of how I could parse multiple pages within a TIFF?

Answer 1:

I use ImageMagick as an external program to convert multi-page faxes into viewable PNGs:

/usr/bin/convert /var/voip/fax/out/2012/04/fax_out_L1_17.tiff[0] -scale 50x100% -depth 16 /tmp/fax_images/fax_out_L1_17-0-m.png

This converts the first page to PNG.

aaa.tiff[1] would be the second page, and so on.

Or to extract all images, do:

convert -verbose fax_in_L1-1333564876.469.tiff a.png
fax_in_L1-1333564876.469.tiff[0] TIFF 1728x1078 1728x1078+0+0 1-bit Bilevel DirectClass 109KiB 0.030u 0:00.030
fax_in_L1-1333564876.469.tiff[1] TIFF 1728x1078 1728x1078+0+0 1-bit Bilevel DirectClass 109KiB 0.020u 0:00.010
fax_in_L1-1333564876.469.tiff[2] TIFF 1728x1078 1728x1078+0+0 1-bit Bilevel DirectClass 109KiB 0.020u 0:00.010
fax_in_L1-1333564876.469.tiff=>a-0.png[0] TIFF 1728x1078 1728x1078+0+0 1-bit Bilevel DirectClass 12KiB 0.030u 0:00.019
fax_in_L1-1333564876.469.tiff=>a-1.png[1] TIFF 1728x1078 1728x1078+0+0 1-bit Bilevel DirectClass 8KiB 0.040u 0:00.039
fax_in_L1-1333564876.469.tiff=>a-2.png[2] TIFF 1728x1078 1728x1078+0+0 1-bit Bilevel DirectClass 32KiB 0.070u 0:00.070

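So, to split one multi-page TIFF into several single-page TIFFs, you would run something like the following (+adjoin is needed because TIFF itself can hold multiple pages, so without it ImageMagick writes one multi-page file again):

convert in-12345.tiff +adjoin /tmp/out-12345.tiff

and then work with the temporary files: /tmp/out-12345-*.tiff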

However, ImageMagick can do a lot of processing, so you can probably achieve your desired result in one command.
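Since the question asks about Python, here is a minimal sketch of driving ImageMagick from a script with subprocess. It assumes the `convert` binary is on the PATH (on ImageMagick 7 it may be `magick` instead) and relies on a `%d` in the output filename to get one file per page. The function names and the `page-%d.tif` naming are made up for illustration.

```python
import glob
import os
import subprocess


def build_split_command(src, out_dir, convert_bin="convert"):
    """Return the argv list that asks ImageMagick for one TIFF per page."""
    # %d in the output name is expanded by ImageMagick, not by Python
    pattern = os.path.join(out_dir, "page-%d.tif")
    return [convert_bin, src, pattern]


def page_number(path):
    """Extract N from a path ending in page-N.tif."""
    name = os.path.basename(path)
    return int(name[len("page-"):-len(".tif")])


def split_with_imagemagick(src, out_dir, convert_bin="convert"):
    """Run ImageMagick and return the files it produced, in page order."""
    os.makedirs(out_dir, exist_ok=True)
    # check=True raises CalledProcessError if ImageMagick exits non-zero
    subprocess.run(build_split_command(src, out_dir, convert_bin), check=True)
    # Sort numerically so page-10 comes after page-2
    # (assumes out_dir holds no other files matching page-*.tif)
    found = glob.glob(os.path.join(out_dir, "page-*.tif"))
    return sorted(found, key=page_number)
```

Calling split_with_imagemagick("in-12345.tiff", "/tmp/fax_pages") would then return the list of per-page files to process further.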



Answer 2:

One project that makes this easy is PIMS (disclosure: I am one of its main authors, and this question was one of the things that prompted me to work on it). The core of PIMS is essentially a cleaned-up and generalized version of the following class.

A class to do basic frame extraction + simple iteration.

import numpy as np
import PIL.Image
class Stack_wrapper(object):
    def __init__(self,fname):
        '''fname is the full path '''
        self.im  = PIL.Image.open(fname)

        self.im.seek(0)
        # get image dimensions from the meta data the order is flipped
        # due to row major v col major ordering in tiffs and numpy
        self.im_sz = [self.im.tag[0x101][0],
                      self.im.tag[0x100][0]]
        self.cur = self.im.tell()

    def get_frame(self,j):
        '''Extracts the jth frame from the image sequence.
        if the frame does not exist return None'''
        try:
            self.im.seek(j)
        except EOFError:
            return None

        self.cur = self.im.tell()
        return np.reshape(self.im.getdata(),self.im_sz)
    def __iter__(self):
        self.im.seek(0)
        self.old = self.cur
        self.cur = self.im.tell()
        return self

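    def __next__(self):
        try:
            self.im.seek(self.cur)
            self.cur = self.im.tell()+1
        except EOFError:
            # ran past the last frame: restore the previous position
            self.im.seek(self.old)
            self.cur = self.im.tell()
            raise StopIteration
        return np.reshape(self.im.getdata(),self.im_sz)

    next = __next__  # Python 2 name for the iterator method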
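For comparison, a similar frame-by-frame iteration can be sketched with Pillow's own ImageSequence helper. This is not the PIMS API; `iter_frames` is a hypothetical name, and the in-memory three-page TIFF exists only so the example runs on its own.

```python
import io

import numpy as np
from PIL import Image, ImageSequence


def iter_frames(source):
    """Yield every page of a multi-page image as a NumPy array."""
    with Image.open(source) as im:
        for page in ImageSequence.Iterator(im):
            # np.array copies the pixels, so each array stays valid
            # after the iterator seeks to the next page
            yield np.array(page)


# Build a tiny 3-page grayscale TIFF in memory (4 wide, 2 high).
pages = [Image.new("L", (4, 2), color=value) for value in (0, 100, 200)]
buffer = io.BytesIO()
pages[0].save(buffer, format="TIFF", save_all=True, append_images=pages[1:])
buffer.seek(0)

frames = list(iter_frames(buffer))
print(len(frames), frames[0].shape)  # arrays are (rows, columns)
```

As in the class above, the array shape is (height, width), the reverse of Pillow's (width, height) size tuple.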


Answer 3:

ImageMagick worked really well for me. When splitting a TIFF file, which is basically converting from TIFF to TIFF, you can use a format specifier in the output filename to force each page to be saved as an individual TIFF file. To do that, try

convert input.tif output-%d.tif

The %d operator is a C printf-style %d. So, if you need a running sequence that is 3 characters wide, you can say (note that %3d pads with spaces; use %03d for zero padding)

convert input.tif output-%3d.tif

and so on. The %d is replaced by the "scene" number of the image. Scene numbers do not necessarily start at 0 (or at 1, if you prefer it that way). To set up the sequence the way you want, try

convert input.tif -scene 1 output-%3d.tif

This starts the sequence at the number you provide. For example:

convert -scene 1 input.TIF output-%d.TIF
output-1.TIF
output-2.TIF
output-3.TIF

Magick indeed!! :)

The ImageMagick documentation has more details. This works on my Windows machine too.
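Python's % operator follows the same printf conventions, so it is a quick way to preview which filenames a pattern would produce. This sketch only mimics the formatting and does not call ImageMagick; `scene_filename` is a hypothetical helper.

```python
def scene_filename(pattern, scene):
    """Expand a printf-style pattern for an integer scene number."""
    return pattern % scene


for pattern in ("output-%d.tif", "output-%3d.tif", "output-%03d.tif"):
    # repr() makes the space padding of %3d visible
    print(repr(scene_filename(pattern, 7)))
# prints:
# 'output-7.tif'
# 'output-  7.tif'
# 'output-007.tif'
```

The middle line shows why %03d is usually the better choice when the names should sort cleanly and contain no spaces.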



Answer 4:

The following splits a tif file with multiple frames into tif files where each file is one frame.

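from PIL import Image

def parse_tif(filePath):
    img = Image.open(filePath)
    # n_frames is the number of pages in the file (default to 1 if absent)
    for i in range(getattr(img, 'n_frames', 1)):
        try:
            img.seek(i)
            img.save('Block_%s.tif'%(i,))
        except EOFError: # ran past the last frame
            break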
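The same idea can be sketched with Pillow's ImageSequence helper, which avoids having to know the frame count in advance. The function name `split_tiff_pages`, the output directory argument, and the demo file are illustrative assumptions rather than part of the answer above.

```python
import os
import tempfile

from PIL import Image, ImageSequence


def split_tiff_pages(src_path, out_dir, prefix="Block_"):
    """Write each page of src_path to its own TIFF and return the paths."""
    written = []
    with Image.open(src_path) as img:
        for index, page in enumerate(ImageSequence.Iterator(img)):
            out_path = os.path.join(out_dir, "%s%d.tif" % (prefix, index))
            # Without save_all=True, only the current page is written
            page.save(out_path)
            written.append(out_path)
    return written


# Demo: create a 3-page TIFF in a temporary directory, then split it.
work_dir = tempfile.mkdtemp()
multi_path = os.path.join(work_dir, "multi.tif")
pages = [Image.new("L", (8, 8), color=c) for c in (0, 128, 255)]
pages[0].save(multi_path, save_all=True, append_images=pages[1:])

outputs = split_tiff_pages(multi_path, work_dir)
print([os.path.basename(p) for p in outputs])
```

Each output file holds a single frame, so it can be opened with any viewer that does not understand multi-page TIFFs.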


Answer 5:

You could convert it to PDF and use pyPDF to split the pages.