I have a melted data set that also includes data generated from a normal distribution. I want to plot the empirical density function of my data against the normal distribution, but the scales of the two resulting density plots are different. I found this post, which deals with two separate data sets:
Normalising the x scales of overlaying density plots in ggplot
but I couldn't figure out how to apply it to melted data. Suppose I have a data frame like this:
library(reshape2)  # for melt()
library(ggplot2)   # for qplot()
df<-data.frame(type=rep(c('A','B'),each=100),x=rnorm(200,1,2)/10,y=rnorm(200))
df.m<-melt(df)
The code below:
qplot(value,data=df.m,col=variable,geom='density',facets=~type)
produces this graph:
How can I make the two densities comparable, given that the normal distribution is the reference plot? (I prefer to use qplot instead of ggplot.)
UPDATE:
I want to produce something like this (i.e. in terms of plot comparison), but with ggplot2:
plot(density(rnorm(200,1,2)/10),col='red',main=NA) #my data
par(new=TRUE)
plot(density(rnorm(200)),axes=FALSE,main=NA,xlab=NA,ylab=NA) # reference data
which generates this:
library(reshape2)    # for melt()
library(ggplot2)     # for ggplot() and qplot()
library(data.table)
df<-data.frame(type=rep(c('A','B'),each=100),x = rnorm(200,1,2)/10, y = rnorm(200))
df.m<-melt(df)
DT <- data.table(df.m)
Insert a new column with the scaled value into DT. Then plot.
This is the code that produces the plots:
# scale() returns a one-column matrix, so convert it to a plain vector first
DT <- DT[, scaled := as.vector(scale(value)), by = "variable"]
str(DT)
ggplot(DT) +
geom_density(aes(x = scaled, color = variable)) +
facet_grid(. ~ type)
qplot(data = DT, x = scaled, color = variable,
facets = ~ type, geom = "density")
# Using fill (inside aes) and alpha outside (so you don't get a legend for it)
ggplot(DT) +
geom_density(aes(x = scaled, fill = variable), alpha = 0.2) +
facet_grid(. ~ type)
# In qplot, wrap constants in I() so they are set rather than mapped (no alpha legend)
qplot(data = DT, x = scaled, fill = variable, geom = "density", alpha = I(0.2), facets = ~type)
# Histogram
ggplot(DT, aes(x = scaled, fill = variable)) +
geom_histogram(binwidth=.2, alpha=.5, position="identity") +
facet_grid(. ~ type, scales = "free")
qplot(data = DT, x = scaled, fill = variable, alpha = I(0.2), facets = ~type)
Is this what you had in mind?
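To see what the per-variable scaling does, here is a minimal sketch in base R, without data.table or reshape2. It assumes a hand-built long data frame with the same columns that melt() would produce (type, variable, value), and it standardises by hand instead of calling scale(). Treat it as an illustration of the idea rather than the exact code above.

```r
# Sketch: per-variable standardisation in base R.
# The long data frame is built by hand here (an assumption for illustration);
# it mirrors the type/variable/value layout of the melted data.
set.seed(1)
df.m <- data.frame(
  type     = rep(rep(c("A", "B"), each = 100), times = 2),
  variable = rep(c("x", "y"), each = 200),
  value    = c(rnorm(200, 1, 2) / 10, rnorm(200))
)

# ave() applies the function within each level of `variable`
# and returns the results in the original row order.
df.m$scaled <- ave(df.m$value, df.m$variable,
                   FUN = function(v) (v - mean(v)) / sd(v))

# After scaling, each variable has mean 0 and standard deviation 1.
group_means <- tapply(df.m$scaled, df.m$variable, mean)
group_sds   <- tapply(df.m$scaled, df.m$variable, sd)
print(round(group_means, 10))
print(round(group_sds, 10))
```

Note that standardising changes both the location and the spread, so the x axis of the resulting plots is in standard-deviation units rather than the original units.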
There's a built-in computed variable, ..scaled.., that does this automatically: it is the density estimate rescaled so that its maximum is 1.
library(reshape2)  # for melt()
library(ggplot2)
set.seed(1)
df<-data.frame(type=rep(c('A','B'),each=100),x=rnorm(200,1,2)/10,y=rnorm(200))
df.m<-melt(df)
ggplot(df.m) +
stat_density(aes(x=value, y=..scaled..,color=variable), position="dodge", geom="line")
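In recent ggplot2 releases (3.3.0 and later) the same computed variable can be written as after_stat(scaled). The sketch below assumes such a version is installed, builds the long data frame by hand instead of calling melt(), adds the facets from the question, and uses layer_data() to check that every curve peaks at 1.

```r
library(ggplot2)

# Long data frame built by hand (an assumption for illustration);
# it has the same type/variable/value layout as the melted data.
set.seed(1)
df.m <- data.frame(
  type     = rep(rep(c("A", "B"), each = 100), times = 2),
  variable = rep(c("x", "y"), each = 200),
  value    = c(rnorm(200, 1, 2) / 10, rnorm(200))
)

# after_stat(scaled) is the density estimate divided by its own maximum,
# computed separately for each group within each panel.
p <- ggplot(df.m, aes(x = value, colour = variable)) +
  stat_density(aes(y = after_stat(scaled)), geom = "line",
               position = "identity") +
  facet_grid(. ~ type)

# Pull out the data computed by the stat and find each curve's peak.
built <- layer_data(p, 1)
peaks <- tapply(built$scaled, list(built$PANEL, built$group), max)
print(peaks)

# print(p)  # uncomment to draw the plot
```

This only equalises the heights of the curves. The two variables keep their original x ranges, so the narrower one still looks like a spike next to the reference normal unless the values are also standardised.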