R scatterplot matrix with error bars

Posted 2019-09-04 00:46

Is there an R package, method, or function that can plot a matrix of scatterplots (like the scatterplot.matrix function of the car package) AND also draw x and y error bars in each panel, as has already been asked and answered for a single plot?

An example:

set.seed(123)
df <- data.frame(X = rnorm(10), errX = rnorm(10)*0.1, Y = rnorm(10), errY = rnorm(10)*0.2, Z = rnorm(10))
require(ggplot2)
ggplot(data = df, aes(x = X, y = Y)) + geom_point() + 
  geom_errorbar(aes(ymin = Y-errY, ymax = Y+errY)) + 
  geom_errorbarh(aes(xmin = X-errX, xmax = X+errX)) + theme_bw()

produces the following plot: [Figure: X vs. Y with error bars]

while

library(car)
spm(~X+Y+Z, data=df)

produces a scatterplot matrix such as this: [Figure: scatterplot matrix of X, Y, and Z]

My expected output would be such a matrix of scatterplots (any package other than car is fine as well) in which I can also display error bars. Note that not all of my variables have errors (e.g. Z does not). The fitting etc. that the spm function does here is a nice gimmick but not necessary for my purposes.

1 answer
Emotional °昔
#2 · 2019-09-04 01:38

Data

set.seed(123)
df <- data.frame(X = rnorm(10), errX = rnorm(10)*0.1,
                 Y = rnorm(10), errY = rnorm(10)*0.2,
                 Z = rnorm(10))

Code

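library(ggplot2)
library(gtools)

# Value columns, and the error column belonging to each (NA if there is none)
valCols <- c("X", "Y", "Z")
errCols <- setNames(c("errX", "errY", NA), valCols)

# All ordered pairs of distinct variables: one row per off-diagonal panel.
# (Named perms rather than combn to avoid masking utils::combn.)
perms <- permutations(length(valCols), 2, valCols)

# Long format: for each pair, the two value columns and their error
# columns, tagged with the variable names used for faceting
mdf <- do.call(rbind,
               apply(perms, 1, function(ind) {
                  # all-NA placeholder for variables without an error column
                  df[["NA.Column"]] <- NA
                  errC <- errCols[ind]
                  errC[is.na(errC)] <- "NA.Column"
                  vals <- setNames(data.frame(df[, ind]), paste0("val", seq_along(ind)))
                  errs <- setNames(data.frame(df[, errC]), paste0("err", seq_along(errC)))
                  ret <- cbind(vals, errs)
                  # factors with all levels, so every facet row/column is kept
                  ret$var1 <- factor(ind[1], levels = valCols)
                  ret$var2 <- factor(ind[2], levels = valCols)
                  ret
               }))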

(p <- ggplot(mdf, aes(x = val1, y = val2, 
                      ymin = val2 - err2, ymax = val2 + err2,
                      xmin = val1 - err1, xmax = val1 + err1)) +
         geom_point() + 
         geom_errorbar() + geom_errorbarh() + 
         facet_grid(var1 ~ var2, drop = FALSE))

Explanation

First, you have to transform your data into a shape that ggplot2 likes: one column each for the x- and y-values, plus one column each for the corresponding error bars.

I used the function permutations from the gtools package, which returns (in this case) all 2-element permutations of the variable names. For each of these permutations, I select the corresponding columns from the original data set and add the related error columns (if they exist). If the column names follow a fixed pattern for value and error columns, you can use a regular expression to determine them automatically, as in:

valCols <- names(df)[grepl("^[A-Z]$", names(df))]
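The error columns can be matched up in the same spirit. The following is only a sketch: it assumes that each error column is named "err" followed by the name of its value column (as in the example data), which is a convention of this example rather than anything ggplot2 or gtools requires.

```r
set.seed(123)
df <- data.frame(X = rnorm(10), errX = rnorm(10) * 0.1,
                 Y = rnorm(10), errY = rnorm(10) * 0.2,
                 Z = rnorm(10))

# Value columns: single upper-case letters (same regex as above)
valCols <- names(df)[grepl("^[A-Z]$", names(df))]

# Assumed naming convention: the error column is "err" + value column name
candidates <- paste0("err", valCols)

# Keep the candidate name where such a column exists, NA otherwise
errCols <- setNames(ifelse(candidates %in% names(df), candidates, NA_character_),
                    valCols)
```

With the example data, Z has no matching column, so its entry in errCols is NA, just like in the hand-written version.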

Finally, I add the columns var1 and var2, which describe which variables were selected:

head(mdf)
#          val1       val2        err1        err2 var1 var2
# 1 -0.56047565 -1.0678237  0.12240818  0.08529284    X    Y
# 2 -0.23017749 -0.2179749  0.03598138 -0.05901430    X    Y
# 3  1.55870831 -1.0260044  0.04007715  0.17902513    X    Y
# 4  0.07050839 -0.7288912  0.01106827  0.17562670    X    Y
# 5  0.12928774 -0.6250393 -0.05558411  0.16431622    X    Y
# 6  1.71506499 -1.6866933  0.17869131  0.13772805    X    Y
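If you would rather not depend on gtools, the same long format can be sketched in base R. This is an alternative under stated assumptions, not the code used above: expand.grid stands in for permutations, and get_err and mdf2 are hypothetical names introduced only for this illustration.

```r
set.seed(123)
df <- data.frame(X = rnorm(10), errX = rnorm(10) * 0.1,
                 Y = rnorm(10), errY = rnorm(10) * 0.2,
                 Z = rnorm(10))
valCols <- c("X", "Y", "Z")
errCols <- c(X = "errX", Y = "errY", Z = NA)

# All ordered pairs of distinct variables (the off-diagonal panels)
pairs <- expand.grid(var1 = valCols, var2 = valCols, stringsAsFactors = FALSE)
pairs <- pairs[pairs$var1 != pairs$var2, ]

# Hypothetical helper: error values for variable v, or NA if it has none
get_err <- function(data, v) {
  if (is.na(errCols[[v]])) rep(NA_real_, nrow(data)) else data[[errCols[[v]]]]
}

# One block of rows per pair, stacked into a single long data frame
mdf2 <- do.call(rbind, lapply(seq_len(nrow(pairs)), function(i) {
  v1 <- pairs$var1[i]
  v2 <- pairs$var2[i]
  data.frame(val1 = df[[v1]], val2 = df[[v2]],
             err1 = get_err(df, v1), err2 = get_err(df, v2),
             var1 = factor(v1, levels = valCols),
             var2 = factor(v2, levels = valCols))
}))
```

mdf2 has the same columns as mdf and can be passed to the ggplot call in its place; the order of the rows may differ, which does not matter for the plot.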

Having the data transformed this way makes it rather easy to generate the scatterplot matrix. The diagonal panels stay empty, because permutations never pairs a variable with itself. With this approach it is also possible to modify a diagonal panel, as shown in the following example:

p + geom_text(aes(ymin = NULL, ymax = NULL, xmin = NULL, xmax = NULL), 
              label = "X",
              data = data.frame(var1 = "X", var2 = "X", 
                                val1 = 0, val2 = 0))
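To label every diagonal panel rather than only the X panel, one option is to build a small data frame with one row per variable. This is a sketch: diag_df and its label column are hypothetical names, and placing the text at (0, 0) simply follows the example above.

```r
valCols <- c("X", "Y", "Z")

# One row per diagonal panel: var1 == var2, with the variable name as label
diag_df <- data.frame(var1  = factor(valCols, levels = valCols),
                      var2  = factor(valCols, levels = valCols),
                      val1  = 0, val2 = 0,
                      label = valCols)

# Usage, assuming p is the plot object created above:
# p + geom_text(aes(x = val1, y = val2, label = label),
#               data = diag_df, inherit.aes = FALSE)
```

Setting inherit.aes = FALSE means the text layer does not look for the error-bar aesthetics, so they do not have to be reset to NULL one by one.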

Plot

[Figure: scatterplot matrix] [Figure: scatterplot matrix with diagonal element]
