What is the unit of the y-axis when using distplot to plot a histogram? I have plotted different histograms together with a normal fit and I see that in one case, it has a range of 0 to 0.9 while in another a range of 0 to 4.5.
Thank you.
From help(sns.distplot):
norm_hist : bool, optional If True, the histogram height shows a density rather than a count. This is implied if a KDE or fitted density is plotted.
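A density is scaled so that the total area of the bars is 1, so the y-axis is in units of 1 / (units of x), not counts or probabilities. An individual bar can therefore be taller than 1 whenever the bins are narrower than 1, which would explain why one of your plots tops out near 0.9 and another near 4.5 [2]. But kde is on by default and overrides norm_hist, so norm_hist changes the y-units only if you explicitly turn kde off: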
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

fig, axs = plt.subplots(figsize=(6, 6), ncols=2, nrows=2)
data = np.random.randint(0, 20, 40)

# Rows toggle the KDE, columns toggle norm_hist.
for row in (0, 1):
    for col in (0, 1):
        sns.distplot(data, kde=bool(row), norm_hist=bool(col), ax=axs[row, col])

axs[0, 0].set_ylabel('NO kernel density')
axs[1, 0].set_ylabel('KDE on')
axs[1, 0].set_xlabel('norm_hist=False')
axs[1, 1].set_xlabel('norm_hist=True')
plt.show()
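To see why the y-range differs so much between datasets, here is a minimal sketch that uses np.histogram(..., density=True) as a stand-in for the normalization; it is not the distplot call itself. The two sample datasets and the helper name density_summary are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: one tightly clustered sample, one widely spread sample.
narrow = rng.normal(loc=0.0, scale=0.1, size=1000)
wide = rng.normal(loc=0.0, scale=5.0, size=1000)


def density_summary(values, bins=10):
    """Return (tallest bar, total bar area) of a density-normalized histogram."""
    heights, edges = np.histogram(values, bins=bins, density=True)
    widths = np.diff(edges)
    # Area = sum of height * width over all bars; this is what is scaled to 1.
    area = float(np.sum(heights * widths))
    return float(heights.max()), area


narrow_peak, narrow_area = density_summary(narrow)
wide_peak, wide_area = density_summary(wide)

print(f"narrow: tallest bar = {narrow_peak:.2f}, area = {narrow_area:.2f}")
print(f"wide:   tallest bar = {wide_peak:.2f}, area = {wide_area:.2f}")
```

Both areas come out as 1, but the tightly clustered sample has narrow bins and so needs bars well above 1 to reach that area, while the widely spread sample stays far below 1.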
[2] clarification from mwaskom, thanks!