Using Wand in Python to convert PDF to JPEG raises WandRuntimeError

Posted 2019-08-20 20:12

Question:

I want to convert the first page of a PDF to an image. The code below works well in my local environment (Ubuntu 18), but when I run it in a Docker environment it fails and raises:

wand.exceptions.WandRuntimeError: MagickReadImage returns false, but did raise ImageMagick exception. This can occurs when a delegate is missing, or returns EXIT_SUCCESS without generating a raster.

Am I missing a dependency? Or something else? I don't know what it's referring to as 'delegate'.

I looked at the source code; it fails here, around line 7873 of wand/image.py:

if blob is not None:
    if not isinstance(blob, abc.Iterable):
        raise TypeError('blob must be iterable, not ' +
                        repr(blob))
    if not isinstance(blob, binary_type):
        blob = b''.join(blob)
    r = library.MagickReadImageBlob(self.wand, blob, len(blob))
elif filename is not None:
    filename = encode_filename(filename)
    r = library.MagickReadImage(self.wand, filename)
if not r:
    self.raise_exception()
    msg = ('MagickReadImage returns false, but did raise ImageMagick '
           'exception. This can occurs when a delegate is missing, or '
           'returns EXIT_SUCCESS without generating a raster.')
    raise WandRuntimeError(msg)

The line r = library.MagickReadImageBlob(self.wand, blob, len(blob)) returns true in my local environment, but in Docker it returns false, even though the arguments blob and len(blob) are the same in both.

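import io

# Assumption: PdfFileReader/PdfFileWriter come from PyPDF2 (the question does not show its imports).
from PyPDF2 import PdfFileReader, PdfFileWriter
from wand.image import Image


def pdf2img(fp, page=0):
    """
    Convert one page of a PDF to a JPEG image.
    :param fp: a file-like object
    :param page: zero-based index of the page to convert
    :return: (bool, file); (False, None) if `fp` is not a PDF, otherwise
        (True, file) where `file` is a file-like object containing the JPEG data
    """
    try:
        reader = PdfFileReader(fp, strict=False)
    except Exception:
        # Not readable as a PDF: rewind so the caller can reuse `fp`.
        fp.seek(0)
        return False, None
    else:
        bytes_in = io.BytesIO()
        bytes_out = io.BytesIO()
        writer = PdfFileWriter()

        # Write the requested page into its own single-page PDF in memory.
        writer.addPage(reader.getPage(page))
        writer.write(bytes_in)
        bytes_in.seek(0)

        # Rasterize the single-page PDF and re-encode it as JPEG.
        im = Image(file=bytes_in, resolution=120)
        im.format = 'jpeg'
        im.save(file=bytes_out)
        bytes_out.seek(0)
        return True, bytes_out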

Answer 1:

> I don't know what it's referring to as 'delegate'.

In ImageMagick, a 'delegate' is any shared library, utility, or external program that does the actual encoding and decoding of a file type; specifically, converting a file format to a raster.

> Am I missing a dependency?

Most likely. For PDF, you would need Ghostscript installed in the Docker container.
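
One quick way to check this from inside the container is to ask Python whether a Ghostscript executable is on the PATH. This is only a minimal sketch: the helper name `find_ghostscript` is made up for illustration, and it assumes the binary is called `gs`, which is the usual name on Linux. Finding the binary does not prove that ImageMagick can use it, but a missing binary would explain the error.

```python
import shutil


def find_ghostscript(names=("gs",)):
    """Return the path of the first matching executable on PATH, or None.

    `names` is an assumption: "gs" is the usual Ghostscript binary name on Linux.
    """
    for name in names:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    gs_path = find_ghostscript()
    if gs_path is None:
        print("Ghostscript not found on PATH; PDF decoding will likely fail.")
    else:
        print("Ghostscript found at", gs_path)
```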

> Or something else?

Possibly, but it is hard to determine without an error message. The "WandRuntimeError" exception is a catch-all. It is raised because a raster could not be generated from the PDF, and neither Wand nor ImageMagick can determine why. Usually there would be a more specific exception, such as a delegate failure, a security policy message, or an OS error.

The best thing would be to run a few gs commands to see whether Ghostscript is working correctly:

gs -sDEVICE=pngalpha -o page-%03d.png -r120 input.pdf

If the above works, then try again with just ImageMagick:

convert -density 120 input.pdf page-%03d.png
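
The same two checks can be scripted so that the exit code and stderr of each tool are captured, which is often where the real error shows up. This is a rough sketch, not a definitive diagnostic: it assumes `gs` and `convert` are the executable names inside the container, `input.pdf` is a placeholder path, and the helper names are hypothetical.

```python
import subprocess


def gs_command(pdf_path, resolution=120):
    """Build the Ghostscript command shown above as an argument list."""
    return ["gs", "-sDEVICE=pngalpha", "-o", "page-%03d.png",
            "-r{}".format(resolution), pdf_path]


def convert_command(pdf_path, density=120):
    """Build the ImageMagick command shown above as an argument list."""
    return ["convert", "-density", str(density), pdf_path, "page-%03d.png"]


def run_and_report(command):
    """Run a command and return (exit_code, stderr_text).

    exit_code is None when the executable itself cannot be found.
    """
    try:
        proc = subprocess.run(command, stdin=subprocess.DEVNULL,
                              stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    except FileNotFoundError:
        return None, "executable not found: " + command[0]
    return proc.returncode, proc.stderr.decode(errors="replace")
```

Calling `run_and_report(gs_command("input.pdf"))` and then `run_and_report(convert_command("input.pdf"))` gives one result per tool. A `None` exit code for the first call would point to Ghostscript not being installed, while a failure only in the second call would point toward ImageMagick's configuration.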