I have been producing boxplots that include a small image explaining what the parts of a boxplot represent, as shown in the top right here. This works fine.
But if I remove the green dashed line, which also removes the legend, the little boxplot image is cropped out, both when I view the plot in Jupyter and when I save it as an image file.
The solution offered here using 'tight' doesn't work, i.e.:
plt.savefig("test1.jpg", dpi=300,bbox_inches='tight')
Nor does:
plt.tight_layout()
I've also tried using AnnotationBbox but can't find a solution.
Working example code below:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib
import matplotlib.pyplot as plt
from matplotlib.offsetbox import (TextArea, DrawingArea, OffsetImage,
                                  AnnotationBbox)
df = pd.DataFrame(np.random.randn(40, 4), columns=list('ABCD'))
df['Class']=list('ADFADAFDADFAFDAADFAFDAFDDFADFAFDADDFDFAD')
assay=df
factor_to_plot='A'
f=factor_to_plot
x_axis_factor='Class'
g=x_axis_factor
pcntls=assay.groupby([g]).describe(percentiles=[0.05,0.1,0.25,0.5,0.75,0.9,0.95])
sumry= pcntls[f].T
#print sumry
ordered=sorted(assay[g].dropna().unique())
#set figure size and scale text
plt.rcParams['figure.figsize']=(15,10)
plt.rcParams['figure.dpi'] = 300
text_scaling=1.9
sns.set(style="whitegrid")
sns.set_context("paper", font_scale=text_scaling)
#plot boxplot
ax=sns.boxplot(x=assay[g],y=assay[f],width=0.5,order=ordered, whis=[10,90],data=assay, showfliers=False,color='lightblue',
               showmeans=True,meanprops={"marker":"x","markersize":12,"markerfacecolor":"white", "markeredgecolor":"black"})
#add dashed line at a value
plt.axhline(0.3, color='green',linestyle='dashed', label="S%=0.3")
#this line sets the scale to logarithmic
#ax.set_yscale('log')
#add legend for dashed line
#plt.legend(markerscale=1.5,loc='center left',bbox_to_anchor=(1.0, 0.5))
#plt.title("Assay data")
#add gridlines (use for log plots)
plt.grid(True, which='both')
#plot additional percentiles not included in boxplots
ax.scatter(x=sorted(list(sumry.columns.values)),y=sumry.loc['5%'],s=120,color='white',edgecolor='black')
ax.scatter(x=sorted(list(sumry.columns.values)),y=sumry.loc['95%'],s=120,color='white',edgecolor='black')
#next line is important, select a column that has no blanks or nans as the total items are counted to produce
#N= annotations to plot.
assay['value']=assay['B']
vals=assay.groupby([g])['value'].count()
j=vals
ymin, ymax = ax.get_ylim()
xmin, xmax = ax.get_xlim()
#print ymax
#put n= values at top of plot
x=0
for i in range(len(j)):
    # j is indexed by class label, so use positional access explicitly
    plt.text(x = x , y = ymax, s = "N=\n" +str(int(j.iloc[i])),horizontalalignment='center')
    #plt.text(x = x , y = 102.75, s = "n=",horizontalalignment='center')
    x+=1
#add legend image
img = plt.imread("legend4.jpg")
plt.figimage(img, 3900,1800, zorder=1, alpha=1)
'''xy = [1.1, 0.8]
fn = "legend4.jpg"
arr_img = plt.imread(fn, format='jpg')
imagebox = OffsetImage(arr_img, zoom=0.2)
imagebox.image.axes = ax
ab = AnnotationBbox(imagebox, xy,
                    boxcoords="figure fraction",
                    )
ax.add_artist(ab)'''
#plt.tight_layout()
#use the section below to adjust the y axis label format to avoid the default of 10^1 etc. for log scale plots.
#ylabels = ['{:.1f}'.format(y) for y in ax.get_yticks()]
#ax.set_yticklabels(ylabels)
plt.savefig("test1.jpg", dpi=300,bbox_inches='tight')
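A note on the `plt.figimage(img, 3900, 1800, ...)` call above: `figimage` takes its offsets in pixels and does not resample the image, so 3900 and 1800 only line up with the 15 x 10 inch, 300 dpi canvas set through `rcParams` (4500 x 3000 pixels). The sketch below is only the arithmetic, written out as an illustration; `figimage_offsets` is a hypothetical helper, not a matplotlib function.

```python
def figimage_offsets(frac_x, frac_y, figsize_in, dpi):
    """Convert figure-fraction coordinates to pixel offsets.

    Hypothetical helper for illustration only: figimage expects
    its xo/yo offsets in pixels, measured from the lower-left corner.
    """
    width_px = figsize_in[0] * dpi
    height_px = figsize_in[1] * dpi
    return int(round(frac_x * width_px)), int(round(frac_y * height_px))


# Canvas used in the question: 15 x 10 inches at 300 dpi.
canvas = (15 * 300, 10 * 300)  # (4500, 3000)

# The hard-coded offsets correspond to these figure fractions.
frac_x, frac_y = 3900 / canvas[0], 1800 / canvas[1]

# Same relative position on the 300 dpi canvas and on a 100 dpi canvas.
offsets_300 = figimage_offsets(frac_x, frac_y, (15, 10), 300)
offsets_100 = figimage_offsets(frac_x, frac_y, (15, 10), 100)
```

If the figure is rendered at a different dpi or size than the one the offsets were chosen for, the same pixel offsets point at a different place on the canvas, or off it entirely, which is one reason fractional coordinates (as in the `AnnotationBbox` attempt) tend to be easier to keep stable.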
Using guidance from @ImportanceOfBeingErnest, I now get what I need, but only in saved plots; the Jupyter inline display is still cropped (the annotation box is not shown).
I also found that I had to delete the bbox_inches='tight' part from the final save statement, otherwise my 'N=...' text annotations were cropped. Can't believe it is this hard to do!
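For reference, the approach described in this update (reserve space beside the axes, attach the image with `AnnotationBbox`, and save without `bbox_inches='tight'`) can be sketched roughly as follows. This is not the code from the update, only a minimal illustration under stated assumptions: a small synthetic array stands in for `legend4.jpg`, a plain matplotlib boxplot stands in for the seaborn one, and the margin and position values are arbitrary.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs outside Jupyter
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.offsetbox import AnnotationBbox, OffsetImage

# Assumption: synthetic stand-in for legend4.jpg (white with a blue bar).
legend_img = np.ones((60, 40, 3))
legend_img[10:50, 15:25] = (0.68, 0.85, 0.90)

fig, ax = plt.subplots(figsize=(6, 4), dpi=100)
ax.boxplot([np.arange(10), np.arange(5, 20)])

# Reserve the right-hand quarter of the figure for the image.
fig.subplots_adjust(right=0.75)

# Anchor the image just outside the right edge of the axes.
imagebox = OffsetImage(legend_img, zoom=1.0)
ab = AnnotationBbox(imagebox, (1.02, 0.8), xycoords="axes fraction",
                    box_alignment=(0.0, 0.5), frameon=False)
ax.add_artist(ab)

# Save the full canvas (no bbox_inches='tight'), then read it back.
buf = io.BytesIO()
fig.savefig(buf, format="png", dpi=100)
saved = plt.imread(io.BytesIO(buf.getvalue()))
```

Because the canvas is saved as-is, nothing outside the axes is trimmed. The Jupyter inline backend renders figures with `bbox_inches='tight'` by default, which may explain why the inline display differs from the saved file; the setting usually suggested for turning that off is `%config InlineBackend.print_figure_kwargs = {'bbox_inches': None}`.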