Addressing x and y in aes by variable number

2020-01-24 02:33发布

站内文章 / 前端开发

8 0

我欲成王，谁敢阻挡

女 | 书童

私信

可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

I need to draw a scatterplot with addressing variables by their column numbers instead of names, i.e. instead of ggplot(dat, aes(x=Var1, y=Var2)) I need something like ggplot(dat, aes(x=dat[,1], y=dat[,2])). (I say 'something' because the latter doesn't work).

Here is my code:

showplot1<-function(indata, inx, iny){
  dat<-indata
  print(nrow(dat)); # this is just to show that object 'dat' is defined
  p <- ggplot(dat, aes(x=dat[,inx], y=dat[,iny]))
  p + geom_point(size=4, alpha = 0.5)
}

testdata<-data.frame(v1=rnorm(100), v2=rnorm(100), v3=rnorm(100), v4=rnorm(100), v5=rnorm(100))
showplot1(indata=testdata, inx=2, iny=3)

# Error in eval(expr, envir, enclos) : object 'dat' not found

回答1:

A variation on @Shadow's answer using new features from ggplot2 V3.0.0 :

showplot <- function(indata, inx, iny){
  nms <- names(indata)
  x <- nms[inx]
  y <- nms[iny]
  p <- ggplot(indata, aes(x = !!ensym(x), y = !!ensym(y)))
  p + geom_point(size=4, alpha = 0.5)
}   

testdata <- data.frame(v1=rnorm(100), v2=rnorm(100), v3=rnorm(100), v4=rnorm(100), v5=rnorm(100))
names(testdata) <- c("a-b", "c-d", "e-f", "g-h", "i-j")
showplot(indata=testdata, inx=2, iny=3)

ensym creates a symbol from the string contained in a variable (so we first have to create those variables at the start of the function), then !! unquotes it, which means it will work as if you had fed the function raw names.

!! works only in the context of functions designed to support it, usually tidyverse functions, else it just means "not not" (similar to as.logical)..

回答2:

Your problem is that aes doesn't know your function's environment and it only looks within global environment. So, the variable dat declared within the function is not visible to ggplot2's aes function unless you pass it explicitly as:

showplot1<-function(indata, inx, iny) {
    dat <- indata
    p <- ggplot(dat, aes(x=dat[,inx], y=dat[,iny]), environment = environment())
    p <- p + geom_point(size=4, alpha = 0.5)
    print(p)
}

Note the argument environment = environment() inside the ggplot() command. It should work now.

回答3:

Try:

showplot1 <- function(indata, inx, iny) {
    x <- names(indata)[inx] 
    y <- names(indata)[iny] 
    p <- ggplot(indata, aes_string(x = x, y = y))
    p + geom_point(size=4, alpha = 0.5)
}

Edited to show what's happening - aes_string uses quoted arguments, names gets them using your numbers.

回答4:

I strongly suggest using aes_q instead of passing vectors to aes (@Arun's answer). It may look a bit more complicated, but it is more flexible, when e.g. updating the data.

showplot1 <- function(indata, inx, iny){
  p <- ggplot(indata, 
              aes_q(x = as.name(names(indata)[inx]), 
                    y = as.name(names(indata)[iny])))
  p + geom_point(size=4, alpha = 0.5)
}

And here's the reason why it is preferable:

# test data (using non-standard names)
testdata<-data.frame(v1=rnorm(100), v2=rnorm(100), v3=rnorm(100), v4=rnorm(100), v5=rnorm(100))
names(testdata) <- c("a-b", "c-d", "e-f", "g-h", "i-j")
testdata2 <- data.frame(v1=rnorm(100), v2=rnorm(100), v3=rnorm(100), v4=rnorm(100), v5=rnorm(100))
names(testdata2) <- c("a-b", "c-d", "e-f", "g-h", "i-j")

# works
showplot1(indata=testdata, inx=2, iny=3)
# this update works in the aes_q version
showplot1(indata=testdata, inx=2, iny=3) %+% testdata2

Note: As of ggplot2 v2.0.0 aes_q() has been replaced with aes_() to be consistent with SE versions of NSE functions in other packages.

回答5:

For completeness, I think it's safer to use column names instead of indices because column positions within a data frame can be changed causing unexpected results.

The plot_duo function below (taken from this answer) can use input either as strings or bare column names

library(rlang)
library(purrr)
library(dplyr)
library(ggplot2)

theme_set(theme_classic(base_size = 14))
set.seed(123456)
testdata <- data.frame(v1 = rnorm(100), v2 = rnorm(100), v3 = rnorm(100), 
                       v4 = rnorm(100), v5 = rnorm(100))

plot_duo <- function(df, plot_var_x, plot_var_y) {

  # check if input is character or bare column name to 
  # use ensym() or enquo() accordingly
  if (is.character(plot_var_x)) {
    print('character column names supplied, use ensym()')
    plot_var_x <- ensym(plot_var_x)
  } else {
    print('bare column names supplied, use enquo()')
    plot_var_x <- enquo(plot_var_x)
  }

  if (is.character(plot_var_y)) {
    plot_var_y <- ensym(plot_var_y)
  } else {
    plot_var_y <- enquo(plot_var_y)
  }

  # unquote the variables using !! (bang bang) so ggplot can evaluate them
  pts_plt <- ggplot(df, aes(x = !! plot_var_x, y = !! plot_var_y)) + 
    geom_point(size = 4, alpha = 0.5)

  return(pts_plt)
}

Apply plot_duo function across columns using purrr::map()

### use character column names
plot_vars1 <- names(testdata)
plt1 <- plot_vars1 %>% purrr::map(., ~ plot_duo(testdata, .x, "v1"))
#> [1] "character column names supplied, use ensym()"
#> [1] "character column names supplied, use ensym()"
#> [1] "character column names supplied, use ensym()"
#> [1] "character column names supplied, use ensym()"
#> [1] "character column names supplied, use ensym()"

str(plt1, max.level = 1)
#> List of 5
#>  $ :List of 9
#>   ..- attr(*, "class")= chr [1:2] "gg" "ggplot"
#>  $ :List of 9
#>   ..- attr(*, "class")= chr [1:2] "gg" "ggplot"
#>  $ :List of 9
#>   ..- attr(*, "class")= chr [1:2] "gg" "ggplot"
#>  $ :List of 9
#>   ..- attr(*, "class")= chr [1:2] "gg" "ggplot"
#>  $ :List of 9
#>   ..- attr(*, "class")= chr [1:2] "gg" "ggplot"

# test plot
plt1[[3]]

### use bare column names
# Ref: https://stackoverflow.com/a/49834499/
plot_vars2 <- rlang::exprs(v2, v3, v4)
plt2 <- plot_vars2 %>% purrr::map(., ~ plot_duo(testdata, .x, rlang::expr(v1)))
#> [1] "bare column names supplied, use enquo()"
#> [1] "bare column names supplied, use enquo()"
#> [1] "bare column names supplied, use enquo()"

str(plt2, max.level = 1)
#> List of 3
#>  $ :List of 9
#>   ..- attr(*, "class")= chr [1:2] "gg" "ggplot"
#>  $ :List of 9
#>   ..- attr(*, "class")= chr [1:2] "gg" "ggplot"
#>  $ :List of 9
#>   ..- attr(*, "class")= chr [1:2] "gg" "ggplot"

plt1[[2]]

^{Created on 2019-02-18 by the reprex package (v0.2.1.9000)}

回答6:

provisional solution I found for the moment:

showplot1<-function(indata, inx, iny){
  dat<-data.frame(myX=indata[,inx], myY=indata[,iny])
  print(nrow(dat)); # this is just to show that object 'dat' is defined
  p <- ggplot(dat, aes(x=myX, y=myY))
  p + geom_point(size=4, alpha = 0.5)
}

But I don't really like it because in my real code, I need other columns from indata and here I will have to define all of them explicitly in dat<-...