I want to be able to ascertain the provenance of the figures I create using matplotlib, i.e. to know which version of my code and data created these figures. (See this essay for more on provenance.)
I imagine the most straightforward approach would be to add the revision numbers of the code and data to the metadata of the saved figures, or as comments in a postscript file for example.
Is there an easy way to do this in Matplotlib? The savefig function doesn't seem to be capable of this, but has someone come up with a workable solution?
I don't know of a way using matplotlib, but you can add metadata to PNGs with PIL:
f = "test.png"
METADATA = {"version":"1.0", "OP":"ihuston"}
# Create a sample image
import matplotlib.pyplot as plt
import numpy as np
X = np.random.random((50,50))
plt.imshow(X)
plt.savefig(f)
# Use PIL to save some image metadata
from PIL import Image
from PIL import PngImagePlugin
im = Image.open(f)
meta = PngImagePlugin.PngInfo()
for key, value in METADATA.items():
    meta.add_text(key, value)
im.save(f, "png", pnginfo=meta)
im2 = Image.open(f)
print(im2.info)
This gives:
{'version': '1.0', 'OP': 'ihuston'}
If you are interested in PDF files, then you can have a look at the matplotlib module matplotlib.backends.backend_pdf. At this link there is a nice example of its usage, which could be "condensed" into the following:
import matplotlib.pyplot as pl
import numpy as np
from matplotlib.backends.backend_pdf import PdfPages
pdffig = PdfPages('figure.pdf')
x=np.arange(10)
pl.plot(x)
pl.savefig(pdffig, format="pdf")
metadata = pdffig.infodict()
metadata['Title'] = 'Example'
metadata['Author'] = 'Pluto'
metadata['Subject'] = 'How to add metadata to a PDF file within matplotlib'
metadata['Keywords'] = 'PdfPages example'
pdffig.close()
If you are generating SVG files, you can simply append text as an XML comment at the end of the SVG file. Editors like Inkscape appear to preserve this text, even if you subsequently edit an image.
Here's an example, based on the answer from Hooked:
import matplotlib.pyplot as plt
import numpy as np
f = "figure.svg"
X = np.random.random((50,50))
plt.imshow(X)
plt.savefig(f)
with open(f, 'a') as svg_file:
    svg_file.write("<!-- Here is some invisible metadata. -->\n")
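Provenance is only useful if you can read it back later, so here is a minimal sketch of a write/read pair for the comment approach. The helper names `append_svg_comment` and `read_svg_comments` are made up for this illustration, and the regular expression is a simple assumption rather than a full XML parser. Note that an XML comment may not contain "--", so the writer rejects such text.

```python
import re


def append_svg_comment(svg_path, text):
    """Append `text` as an XML comment after the end of an SVG file."""
    if "--" in text:
        # "--" is not allowed inside an XML comment.
        raise ValueError("comment text must not contain '--'")
    with open(svg_path, "a", encoding="utf-8") as svg_file:
        svg_file.write("<!-- {} -->\n".format(text))


def read_svg_comments(svg_path):
    """Return the stripped text of every XML comment in the file, in order."""
    with open(svg_path, encoding="utf-8") as svg_file:
        content = svg_file.read()
    return [c.strip() for c in re.findall(r"<!--(.*?)-->", content, flags=re.DOTALL)]
```

Files written by matplotlib may already contain comments of their own, so `read_svg_comments` returns all of them; the one you appended should be the last entry.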
As of matplotlib version 2.1.0, the savefig command accepts the keyword argument metadata. You pass in a dictionary with string key/value pairs to be saved.
This only fully works with the 'agg' backend for PNG files.
For PDF and PS files you can use a pre-defined list of tags.
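To tie this back to the original question, here is a rough sketch of how the code and data versions could be collected and handed to savefig. It assumes the code lives in a git repository and that a SHA-256 hash of the data file is an acceptable stand-in for a data revision number. The function names and the keys "CodeRevision" and "DataSHA256" are invented for this example; they are only suitable for PNG output, since PDF and PS are limited to the pre-defined tags.

```python
import hashlib
import subprocess


def file_sha256(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as data_file:
        for chunk in iter(lambda: data_file.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def git_revision():
    """Return the current git commit hash, or "unknown" if it cannot be found."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # git is not installed, or we are not inside a repository.
        return "unknown"
    return result.stdout.strip()


def provenance_metadata(data_path):
    """Build a string-to-string dict (hypothetical keys) for savefig."""
    return {
        "CodeRevision": git_revision(),
        "DataSHA256": file_sha256(data_path),
    }


def save_png_with_provenance(fig, png_path, data_path):
    """Save a matplotlib Figure as PNG with the provenance dict attached."""
    fig.savefig(png_path, metadata=provenance_metadata(data_path))
```

With a figure `fig` and an input file such as "data.csv", calling `save_png_with_provenance(fig, "figure.png", "data.csv")` should store both values as PNG text chunks, which can then be inspected with `Image.open("figure.png").info` as in the PIL answer above.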