I am trying to recreate the following plot with R. Minitab describes this as a normal probability plot.
The probplot gets you most of the way there. Unfortunately, I cannot figure out how to add the confidence interval bands around this plot.
Similarly, ggplot's stat_qq() seems to present similar information with a transformed x axis. It seems that geom_smooth()
would be the likely candidate to add the bands, but I haven't figure that out.
Finally, the Getting Genetics Done guys describe something similar here.
Sample data to recreate the plot above:
x <- c(40.2, 43.1, 45.5, 44.5, 39.5, 38.5, 40.2, 41.0, 41.6, 43.1, 44.9, 42.8)
If anyone has a solution with base graphics or ggplot, I'd appreciate it!
EDIT
After looking at the details of probplot
, I've determined this is how it generates the fit line on the graph:
> xl <- quantile(x, c(0.25, 0.75))
> yl <- qnorm(c(0.25, 0.75))
> slope <- diff(yl)/diff(xl)
> int <- yl[1] - slope * xl[1]
> slope
75%
0.4151
> int
75%
-17.36
Indeed, comparing these results to what you get out of the probplot object seem to compare very well:
> check <- probplot(x)
> str(check)
List of 3
$ qdist:function (p)
$ int : Named num -17.4
..- attr(*, "names")= chr "75%"
$ slope: Named num 0.415
..- attr(*, "names")= chr "75%"
- attr(*, "class")= chr "probplot"
>
However, incorporating this information into ggplot2 or base graphics does not yield the same results.
probplot(x)
Versus:
ggplot(data = df, aes(x = x, y = y)) + geom_point() + geom_abline(intercept = int, slope = slope)
I get similar results using R's base graphics
plot(df$x, df$y)
abline(int, slope, col = "red")
Lastly, I've learned that the last two rows of the legend refer to the Anderson-Darling test for normality and can be reproduced with the nortest
package.
> ad.test(x)
Anderson-Darling normality test
data: x
A = 0.2303, p-value = 0.7502
Perhaps this will be something you can build on. By default, stat_smooth() uses level=0.95.
Try the
qqPlot
function in theQTLRel
package.I know it's an old question, but for others who also still look for a solution, have a look at
ggqqplot
from theggpubr
package.you are using the incorrect "y", they should be quantiles (labeled with probabilities). The following shows the line in the right spot:
to add the confidence bounds as in Minitab, you can do the following
and add the following two lines to your ggplot from above (in addition, replace the slope and intercept line approach with the estimated percentiles)