I want to add a density line (a normal density actually) to a histogram.
Suppose I have the following data. I can plot the histogram with ggplot2:
library(ggplot2)
set.seed(123)
df <- data.frame(x = rbeta(10000, shape1 = 2, shape2 = 4))
ggplot(df, aes(x = x)) +
  geom_histogram(colour = "black", fill = "white", binwidth = 0.01)
I can add a density line using:
# after_stat(density) is the current spelling of ..density.. (needs ggplot2 >= 3.3.0)
ggplot(df, aes(x = x)) +
  geom_histogram(aes(y = after_stat(density)), colour = "black", fill = "white",
                 binwidth = 0.01) +
  stat_function(fun = dnorm, args = list(mean = mean(df$x), sd = sd(df$x)))
But this is not what I actually want. I want this density line to be fitted to the count data, that is, drawn on the count scale rather than the density scale.
I found a similar post (HERE) that offered a solution to this problem, but it did not work in my case. I need an arbitrary expansion factor to get what I want, and this is not generalizable at all:
ef <- 100 # Expansion factor
ggplot(df, aes(x = x)) +
  geom_histogram(colour = "black", fill = "white", binwidth = 0.01) +
  stat_function(
    fun = function(x, mean, sd, n) {
      n * dnorm(x = x, mean = mean, sd = sd)
    },
    args = list(mean = mean(df$x), sd = sd(df$x), n = ef)
  )
Any clues that I can use to generalize this
- first to normal distribution,
- then to any other bin size,
- and lastly to any other distribution will be very helpful.
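
One possible direction, offered as a sketch rather than a definitive answer: the expected count in a bin is roughly n * binwidth * density, so the factor of 100 above is the same as 10000 observations times a bin width of 0.01. The helper `scaled_density()` below is a hypothetical name introduced for illustration (it is not part of ggplot2). It assumes equal-width bins and that the density function takes `x` as its first argument.

```r
library(ggplot2)

set.seed(123)
df <- data.frame(x = rbeta(10000, shape1 = 2, shape2 = 4))

# Hypothetical helper (not a ggplot2 function): wraps any density function
# so that it is drawn on the count scale of an equal-width histogram.
# Expected count per bin is approximately n * binwidth * density.
scaled_density <- function(dfun, n, binwidth, ...) {
  function(x) n * binwidth * dfun(x, ...)
}

bw <- 0.01  # change the bin width here; the curves rescale with the bars

p <- ggplot(df, aes(x = x)) +
  geom_histogram(colour = "black", fill = "white", binwidth = bw) +
  # Normal density with the sample mean and standard deviation
  stat_function(fun = scaled_density(dnorm, n = nrow(df), binwidth = bw,
                                     mean = mean(df$x), sd = sd(df$x))) +
  # Any other density works the same way, e.g. the beta density
  # with the shape parameters used to simulate the data
  stat_function(fun = scaled_density(dbeta, n = nrow(df), binwidth = bw,
                                     shape1 = 2, shape2 = 4),
                colour = "red")
p
```

Under these assumptions, changing `bw` or swapping `dnorm` for another density function (with its parameters passed through `...`) should be the only edits needed. The curve integrates to `n * binwidth` rather than 1, which is what puts it on the same scale as the bar heights.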