I have the following code to compare two distributions:
sns.kdeplot(df['term'][df['outcome'] == 0], shade=1, color='red')
sns.kdeplot(df['term'][df['outcome'] == 1], shade=1, color='green');
It looks like this:
How do I plot just the difference between the two distributions (disA - disB)? Of course, it could contain negative values.
Since the difference between two kde curves is not a kde curve itself, you cannot use kdeplot to plot that difference.
A kde is easily calculated using scipy.stats.gaussian_kde. Evaluate both kdes on the same grid of x values, subtract one from the other, and plot the result with pyplot.
Mind that the result is really just the difference between the curves (as asked for); it has no statistical relevance at all.
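A minimal sketch of that approach could look like the following. The arrays a and b are synthetic stand-ins for df['term'][df['outcome'] == 0] and df['term'][df['outcome'] == 1]. The grid size, padding and colors are arbitrary choices for illustration, not anything taken from the question.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

# Assumed example data; replace with the two groups from your DataFrame, e.g.
# a = df['term'][df['outcome'] == 0].dropna().to_numpy()
# b = df['term'][df['outcome'] == 1].dropna().to_numpy()
rng = np.random.default_rng(0)
a = rng.normal(loc=36, scale=6, size=500)
b = rng.normal(loc=42, scale=8, size=500)

# Fit one Gaussian KDE per group (default bandwidth selection).
kde_a = gaussian_kde(a)
kde_b = gaussian_kde(b)

# Both curves must be evaluated on the same x grid before subtracting.
lo = min(a.min(), b.min())
hi = max(a.max(), b.max())
pad = 0.1 * (hi - lo)
grid = np.linspace(lo - pad, hi + pad, 300)

# Pointwise difference of the two density estimates (disA - disB).
diff = kde_a(grid) - kde_b(grid)

fig, ax = plt.subplots()
ax.plot(grid, diff, color='black')
# Shade the regions where group A is denser (red) or group B is denser (green).
ax.fill_between(grid, diff, 0, where=diff >= 0, color='red', alpha=0.3)
ax.fill_between(grid, diff, 0, where=diff < 0, color='green', alpha=0.3)
ax.axhline(0, color='gray', linewidth=0.8)
ax.set_xlabel('term')
ax.set_ylabel('density difference')
# Call plt.show() here when running as a script; notebooks render the figure themselves.
```

Because each kde integrates to 1, the positive and negative areas of the difference curve cancel out (up to whatever is cut off at the edges of the grid).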