I am creating two Python scripts to produce some plots for a technical report. In the first script I define functions that produce plots from raw data on my hard disk. Each function produces one specific kind of plot that I need. The second script is more like a batch file: it is supposed to loop over those functions and store the resulting plots on my hard disk.
What I need is a way to return a plot in Python. So basically I want to do this:
fig = some_function_that_returns_a_plot(args)
fig.savefig('plot_name')
But what I do not know is how to make a plot a variable that I can return. Is this possible? If so, how?
You can define your plotting functions like
import numpy as np
import matplotlib.pyplot as plt

# an example graph type
def fig_barh(ylabels, xvalues, title=''):
    # create a new figure with a single Axes
    fig, ax = plt.subplots()

    # plot to it, with each bar centred on its tick position
    yvalues = np.arange(len(ylabels))
    ax.barh(yvalues, xvalues, align='center')
    ax.set_yticks(yvalues)
    ax.set_yticklabels(ylabels)

    if title:
        ax.set_title(title)

    # return it
    return fig
then use them like
from matplotlib.backends.backend_pdf import PdfPages

def write_pdf(fname, figures):
    doc = PdfPages(fname)
    for fig in figures:
        fig.savefig(doc, format='pdf')
    doc.close()

def main():
    a = fig_barh(['a','b','c'], [1, 2, 3], 'Test #1')
    b = fig_barh(['x','y','z'], [5, 3, 1], 'Test #2')
    write_pdf('test.pdf', [a, b])

if __name__=="__main__":
    main()
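To tie this back to the batch script described in the question, here is a minimal sketch of a loop that calls several plotting functions and saves each returned figure to its own file. The names `fig_line` and `save_all`, the job list, and the temporary output directory are all hypothetical and made up for illustration. The Agg backend is selected on the assumption that the batch script only writes files and never opens a window.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: no display needed, we only write files
import matplotlib.pyplot as plt


def fig_line(xs, ys, title=""):
    """Hypothetical plotting function: build a figure and return it."""
    fig, ax = plt.subplots()
    ax.plot(xs, ys)
    if title:
        ax.set_title(title)
    return fig


def save_all(jobs, out_dir):
    """Run each (name, func, args) job, save its figure, return the paths."""
    paths = []
    for name, func, args in jobs:
        fig = func(*args)
        out_path = os.path.join(out_dir, name + ".png")
        fig.savefig(out_path)
        plt.close(fig)  # release the figure so long batches do not pile up
        paths.append(out_path)
    return paths


out_dir = tempfile.mkdtemp()  # hypothetical output location
jobs = [
    ("rising", fig_line, ([0, 1, 2], [0, 1, 4], "Rising")),
    ("falling", fig_line, ([0, 1, 2], [4, 1, 0], "Falling")),
]
saved = save_all(jobs, out_dir)
```

Closing each figure after saving is a design choice rather than a requirement: pyplot keeps a reference to every figure it creates, so a long batch run would otherwise hold all of them in memory.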
The currently accepted answer didn't work for me as such, as I was using scipy.stats.probplot() to plot. I used matplotlib.pyplot.gca() to access an Axes instance directly instead:
"""
For my plotting ideas, see:
https://pythonfordatascience.org/independent-t-test-python/
For the dataset, see:
https://github.com/Opensourcefordatascience/Data-sets
"""

# Import modules.
from scipy import stats
import matplotlib.pyplot as plt
import pandas as pd
from tempfile import gettempdir
from os import path
from slugify import slugify

# Define plot func. `data` is the pandas Series to analyse.
def get_plots(data):
    # plt.figure(): Create a new figure for the P-P plot. If we're inside
    # a loop, and want a new plot for every iteration, this is important!
    plt.figure()
    stats.probplot(data, plot=plt)
    plt.title('Sepal Width P-P Plot')
    pp_p = plt.gca()  # Assign an Axes instance of the plot.

    # Plot histogram. This uses pandas.Series.plot(), which returns
    # an instance of the Axes directly.
    hist_p = data.plot(kind='hist', title='Sepal Width Histogram Plot',
                       ax=plt.figure().gca())  # Draw on a new figure again.

    return pp_p, hist_p

# Import raw data.
df = pd.read_csv('https://raw.githubusercontent.com/'
                 'Opensourcefordatascience/Data-sets/master//Iris_Data.csv')

# Subset the dataset.
setosa = df[df['species'] == 'Iris-setosa'].reset_index()
versicolor = df[df['species'] == 'Iris-versicolor'].reset_index()

# Calculate a variable for analysis.
diff = setosa['sepal_width'] - versicolor['sepal_width']

# Create plots, save each of them to a temp file, and show them afterwards.
# As they're just Axes instances, we need to call get_figure() first.
for plot in get_plots(diff):
    outfn = path.join(gettempdir(), slugify(plot.title.get_text()) + '.png')
    print('Saving a plot to "' + outfn + '".')
    plot.get_figure().savefig(outfn)
    plot.get_figure().show()
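A variant of the same idea, sketched under assumptions: let the plotting function accept an optional Axes and return it, so the caller can always reach the Figure through `get_figure()` and can also place several plots in one figure. The helper `hist_on_axes` and the random sample data are hypothetical and only stand in for a real dataset. The Agg backend is assumed so that the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run, nothing is shown on screen
import matplotlib.pyplot as plt
import numpy as np


def hist_on_axes(values, ax=None, title=""):
    """Hypothetical helper: draw a histogram on `ax` (or a new Axes) and return it."""
    if ax is None:
        _, ax = plt.subplots()
    ax.hist(values, bins=10)
    ax.set_title(title)
    return ax


rng = np.random.default_rng(0)
sample = rng.normal(size=200)  # made-up data for illustration

# One plot per figure: the Figure is recovered from the returned Axes.
ax = hist_on_axes(sample, title="Standalone")
fig = ax.get_figure()

# Two plots in one figure: the caller owns the layout and passes each Axes in.
fig2, (left, right) = plt.subplots(1, 2)
hist_on_axes(sample, ax=left, title="Left")
hist_on_axes(sample * 2, ax=right, title="Right")
```

Returning the Axes rather than the Figure keeps both options open: `fig.savefig(...)` works as before, and the same function can be reused inside a multi-panel figure.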