I am able to extract the necessary information using R, but for consistency within the overall project, I would like to be able to do it with Python (preferably Python3). I need the contents of a single tag called "Settings". This tag contains XML which will then need to be parsed.
Getting the metadata in R is incredibly easy:
library(exifr)
library(XML)
path = file.path('path', 'to', 'file')
x = read_exif(file.path(path,'image.png'))
x$Settings
It doesn't look like Python can do it, which boggles my mind. Or doing so requires me to have far more knowledge of Python and PNGs than I have at the moment. How can I extract PNG metadata using Python?
Here's the list of things I've tried:
PyPNG
PyPNG seems promising. Judging by the length of each chunk, the "Settings" tag most likely lives in the zTXt chunk.
import png
filename = "C:\\path\\to\\image.png"
im = png.Reader(filename)
for c in im.chunks():
    print(c[0], len(c[1]))
>>>
IHDR 13
tIME 7
pHYs 9
IDAT 47775
zTXt 714
IEND 0
The above was taken from this post. However, it's still unclear how to extract the zTXt data.
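For what it's worth, the layout of a zTXt chunk is fixed by the PNG specification: a keyword, a null byte, one compression-method byte (always 0), then zlib-compressed Latin-1 text. Below is a minimal sketch of decoding it, assuming the chunk really is a standard zTXt chunk and that PyPNG is installed. The helper names parse_ztxt and read_ztxt are made up for illustration.

```python
import zlib


def parse_ztxt(data):
    """Split a raw zTXt chunk payload into (keyword, text)."""
    # Layout per the PNG spec: keyword, NUL, 1-byte compression method, zlib stream.
    keyword, sep, rest = data.partition(b"\x00")
    if not sep or not rest:
        raise ValueError("not a well-formed zTXt payload")
    if rest[0] != 0:
        raise ValueError("unknown compression method: %d" % rest[0])
    text = zlib.decompress(rest[1:])
    return keyword.decode("latin-1"), text.decode("latin-1")


def read_ztxt(filename):
    """Return {keyword: text} for every zTXt chunk in a PNG file."""
    import png  # PyPNG; imported here so parse_ztxt works without it

    found = {}
    for chunk_type, data in png.Reader(filename=filename).chunks():
        # Depending on the PyPNG version the chunk type may be bytes or str.
        if chunk_type in (b"zTXt", "zTXt"):
            keyword, text = parse_ztxt(data)
            found[keyword] = text
    return found
```

If the guess about the chunk is right, read_ztxt(filename).get("Settings") should return the XML as a string, which could then be handed to xml.etree.ElementTree.fromstring.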
hachoir3
Using the hachoir3 package, I tried the following:
from hachoir.parser import createParser
from hachoir.metadata import extractMetadata
filename = "C:\\path\\to\\file\\image.png"
parser = createParser(filename)
metadata = extractMetadata(parser)
for line in metadata.exportPlaintext():
    print(line)
This gives me the following:
Metadata:
- Image width: 1024 pixels
- Image height: 46 pixels
- Bits/pixel: 16
- Pixel format: RGB
- Compression rate: 2.0x
- Image DPI width: 1 DPI
- Image DPI height: 1 DPI
- Creation date: 2016-07-13 19:09:28
- Compression: deflate
- MIME type: image/png
- Endianness: Big endian
I can't seem to get at the field I need, the "Settings" one referenced in the R code. I've had no luck with other methods, such as metadata.get. As far as I can tell, those seem to be the two options for parsing PNG metadata. The docs read,
Some good (but not perfect ;-)) parsers:
Matroska video
Microsoft RIFF (AVI video, WAV audio, CDA file)
PNG picture
TAR and ZIP archive
Maybe it just doesn't have the functionality I need?
Pillow
Following the advice given in this post:
from PIL import Image
filename = "C:\\path\\to\\file\\image.png"
im = Image.open(filename)
This reads in the image, but im.info only returns {'aspect': (1, 1)}. Reading through the documentation, it doesn't look like any of the methods get at the metadata. I read through the PNG description provided in the post. Honestly, I don't know how to make use of its information, nor how Pillow would help me with it.
There are some posts which imply that what I need can be done, but they do not work. For example, this post suggests using the ExifTags library:
from PIL import Image, ExifTags
filename = "C:\\path\\to\\file\\image.png"
im = Image.open(filename)
exif = { ExifTags.TAGS[k]: v for k, v in im._getexif().items() if k in ExifTags.TAGS}
The problem is, AttributeError: 'PngImageFile' object has no attribute '_getexif'. According to the documentation, the ._getexif feature is experimental and only applies to JPGs.
Reading through the overall Pillow documentation, it really only talks about JPG and TIFF. Processing PNG files doesn't seem to be part of the package at all. So like hachoir, maybe it can't be done?
PIL
There's apparently another package, PIL, from which Pillow was forked. It looks like it was abandoned in 2009.
Here is an inelegant and clumsy but working solution.
Adapted from here: https://motherboard.vice.com/en_us/article/aekn58/hack-this-extra-image-metadata-using-python
You can call the command-line exiftool app from within Python and then parse the results.
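A minimal sketch of that idea (not the original listing): it assumes the exiftool binary is installed and on the PATH, and it uses exiftool's -j option to get JSON output rather than parsing plain text. The function names are hypothetical.

```python
import json
import subprocess


def parse_exiftool_json(output):
    """Turn the text printed by `exiftool -j` into a dict for one file."""
    # exiftool -j prints a JSON list with one object per input file.
    records = json.loads(output)
    return records[0] if records else {}


def read_metadata(path, exiftool="exiftool"):
    """Run exiftool on `path` and return its tags as a dict."""
    completed = subprocess.run(
        [exiftool, "-j", path],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text (works on Python 3.6)
        check=True,  # raise CalledProcessError if exiftool fails
    )
    return parse_exiftool_json(completed.stdout)
```

With this, read_metadata("image.png").get("Settings") would return the tag's contents if exiftool reports a tag under that name.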
Below is the code which works in Python 3.6.3 under Ubuntu 16.04:
It produces the following results for my test image: