How to assign a plot to a variable and use the var

Posted 2020-06-03 10:05

Question:

I am creating two Python scripts to produce some plots for a technical report. In the first script I define functions that produce plots from raw data on my hard disk. Each function produces one specific kind of plot that I need. The second script is more like a batch file: it is supposed to loop over those functions and store the resulting plots on my hard disk.

What I need is a way to return a plot in Python. So basically I want to do this:

fig = some_function_that_returns_a_plot(args)
fig.savefig('plot_name')

But what I do not know is how to make a plot a variable that I can return. Is this possible? If so, how?

Answer 1:

You can define your plotting functions like this:

import numpy as np
import matplotlib.pyplot as plt

# an example graph type
def fig_barh(ylabels, xvalues, title=''):
    # create a new figure with a single Axes
    fig, ax = plt.subplots()

    # plot to it (bars are centered on their y positions,
    # so the ticks go at the same positions)
    yvalues = np.arange(len(ylabels))
    ax.barh(yvalues, xvalues)
    ax.set_yticks(yvalues)
    ax.set_yticklabels(ylabels)
    if title:
        ax.set_title(title)

    # return it
    return fig

Then use them like this:

from matplotlib.backends.backend_pdf import PdfPages

def write_pdf(fname, figures):
    # PdfPages is a context manager, so the file is closed even on errors
    with PdfPages(fname) as doc:
        for fig in figures:
            fig.savefig(doc, format='pdf')

def main():
    a = fig_barh(['a','b','c'], [1, 2, 3], 'Test #1')
    b = fig_barh(['x','y','z'], [5, 3, 1], 'Test #2')
    write_pdf('test.pdf', [a, b])

if __name__=="__main__":
    main()
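
The pattern asked for in the question (`fig = some_function(...)` followed by `fig.savefig(...)`) can also be sketched without a PDF container. The following is a minimal sketch, not part of the original answer: `make_line_plot` and `save_figure` are hypothetical names, the Agg backend is an assumption for a headless batch run, and an in-memory buffer stands in for a file path.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless batch run, no GUI window needed
import matplotlib.pyplot as plt
from matplotlib.figure import Figure


def make_line_plot(xs, ys, title=""):
    """Hypothetical plotting function: build a figure and return it."""
    fig, ax = plt.subplots()
    ax.plot(xs, ys)
    if title:
        ax.set_title(title)
    return fig


def save_figure(fig, target):
    """Save the figure to a path or file-like object, then close it."""
    fig.savefig(target, format="png")
    plt.close(fig)  # keeps pyplot from accumulating open figures in a loop


# The figure is an ordinary object: assign it, pass it around, save it.
fig = make_line_plot([0, 1, 2], [0, 1, 4], "Squares")
buffer = io.BytesIO()  # stand-in for a file name such as 'plot_name.png'
save_figure(fig, buffer)
png_bytes = buffer.getvalue()
```

Closing each figure after saving is a design choice that matters mostly in batch scripts, where many figures would otherwise stay registered with pyplot.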


Answer 2:

The currently accepted answer did not work for me as written, because I was using scipy.stats.probplot() to plot. Instead, I used matplotlib.pyplot.gca() to get the current Axes instance directly:

"""
For my plotting ideas, see:
https://pythonfordatascience.org/independent-t-test-python/
For the dataset, see:
https://github.com/Opensourcefordatascience/Data-sets
"""

# Import modules.
from scipy import stats
import matplotlib.pyplot as plt
import pandas as pd
from tempfile import gettempdir
from os import path
from slugify import slugify

# Define plot func.
def get_plots(data):
    # 'data' is the pandas Series to analyse (passed in by the caller,
    # rather than read from a global variable).

    # plt.figure(): Create a new P-P plot. If we're inside a loop, and want
    #               a new plot for every iteration, this is important!
    plt.figure()
    stats.probplot(data, plot=plt)
    plt.title('Sepal Width P-P Plot')
    pp_p = plt.gca() # Get the Axes instance of the current plot.

    # Plot histogram on a new figure. pandas' plot() returns the Axes
    # instance it drew on.
    _, hist_ax = plt.subplots()
    hist_p = data.plot(kind='hist', title='Sepal Width Histogram Plot',
                       ax=hist_ax)

    return pp_p, hist_p

# Import raw data.
df = pd.read_csv('https://raw.githubusercontent.com/'
                 'Opensourcefordatascience/Data-sets/master//Iris_Data.csv')

# Subset the dataset.
setosa = df[df['species'] == 'Iris-setosa'].reset_index()
versicolor = df[df['species'] == 'Iris-versicolor'].reset_index()

# Calculate a variable for analysis.
diff = setosa['sepal_width'] - versicolor['sepal_width']

# Create plots, save each of them to a temp file, and show them afterwards.
# As they're just Axes instances, we need to call get_figure() first.
for plot in get_plots(diff):
    outfn = path.join(gettempdir(), slugify(plot.title.get_text()) + '.png')
    print('Saving a plot to "' + outfn + '".')
    plot.get_figure().savefig(outfn)
    plot.get_figure().show()
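
The idea of returning Axes instead of Figures can be reduced to a smaller sketch that needs neither scipy nor pandas. The names below (`plot_hist`, the sample values) are hypothetical and only illustrate the pattern; the Agg backend is an assumption for running without a display. Accepting an optional `ax` argument lets the caller decide whether the plot gets its own figure or a slot in an existing one.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available
import matplotlib.pyplot as plt


def plot_hist(values, title, ax=None):
    """Hypothetical helper: draw a histogram and return its Axes."""
    if ax is None:
        # No Axes supplied, so create a fresh figure with one Axes.
        _, ax = plt.subplots()
    ax.hist(values, bins=5)
    ax.set_title(title)
    return ax


# Stand-alone use: the owning Figure is recovered from the returned Axes.
ax = plot_hist([1, 2, 2, 3, 3, 3], "Demo Histogram")
fig = ax.get_figure()

# Combined use: two plots side by side in one figure.
combined_fig, (left, right) = plt.subplots(1, 2)
returned_left = plot_hist([1, 1, 2], "Left", ax=left)
returned_right = plot_hist([4, 5, 5], "Right", ax=right)
```

Calling `fig.savefig(...)` or `combined_fig.savefig(...)` then writes the respective figure to disk, in the same way as in the loop above.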