Addressing x and y in aes by variable number

Posted 2020-01-24 02:27

I need to draw a scatterplot that addresses variables by column number instead of by name, i.e. instead of ggplot(dat, aes(x=Var1, y=Var2)) I need something like ggplot(dat, aes(x=dat[,1], y=dat[,2])). (I say 'something like' because the latter doesn't work.)

Here is my code:

showplot1<-function(indata, inx, iny){
  dat<-indata
  print(nrow(dat)); # this is just to show that object 'dat' is defined
  p <- ggplot(dat, aes(x=dat[,inx], y=dat[,iny]))
  p + geom_point(size=4, alpha = 0.5)
}

testdata<-data.frame(v1=rnorm(100), v2=rnorm(100), v3=rnorm(100), v4=rnorm(100), v5=rnorm(100))
showplot1(indata=testdata, inx=2, iny=3)
# Error in eval(expr, envir, enclos) : object 'dat' not found

Tags: r ggplot2
6 answers
劳资没心,怎么记你
#2 · 2020-01-24 03:02

A provisional solution I have found for now:

showplot1<-function(indata, inx, iny){
  dat<-data.frame(myX=indata[,inx], myY=indata[,iny])
  print(nrow(dat)); # this is just to show that object 'dat' is defined
  p <- ggplot(dat, aes(x=myX, y=myY))
  p + geom_point(size=4, alpha = 0.5)
}

But I don't really like it, because my real code needs other columns from indata, and with this approach I would have to define all of them explicitly in dat<-...

迷人小祖宗
#3 · 2020-01-24 03:07

For completeness: I think it's safer to use column names instead of indices, because column positions within a data frame can change and cause unexpected results.
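
The risk with positions can be seen without ggplot2 at all. The following is a minimal base R sketch; the data frame df and the variable names are made up for illustration and are not part of the original answer.

```r
# Hypothetical data frame, for illustration only
df <- data.frame(v1 = 1:3, v2 = 4:6, v3 = 7:9)

by_index_before <- df[[2]]      # second column, currently v2
by_name_before  <- df[["v2"]]

# Reorder the columns, as an upstream merge or select step might do
df <- df[, c("v3", "v1", "v2")]

by_index_after <- df[[2]]       # position 2 now silently points at v1
by_name_after  <- df[["v2"]]    # the name still points at v2

identical(by_name_before, by_name_after)    # TRUE
identical(by_index_before, by_index_after)  # FALSE
```

If indices are unavoidable, converting them to names once, right at the point where the data frame's layout is known, keeps the rest of the code name-based.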

The plot_duo function below (taken from this answer) accepts either strings or bare column names as input.

library(rlang)
library(purrr)
library(dplyr)
library(ggplot2)

theme_set(theme_classic(base_size = 14))
set.seed(123456)
testdata <- data.frame(v1 = rnorm(100), v2 = rnorm(100), v3 = rnorm(100), 
                       v4 = rnorm(100), v5 = rnorm(100))

plot_duo <- function(df, plot_var_x, plot_var_y) {

  # check if input is character or bare column name to 
  # use ensym() or enquo() accordingly
  if (is.character(plot_var_x)) {
    print('character column names supplied, use ensym()')
    plot_var_x <- ensym(plot_var_x)
  } else {
    print('bare column names supplied, use enquo()')
    plot_var_x <- enquo(plot_var_x)
  }

  if (is.character(plot_var_y)) {
    plot_var_y <- ensym(plot_var_y)
  } else {
    plot_var_y <- enquo(plot_var_y)
  }

  # unquote the variables using !! (bang bang) so ggplot can evaluate them
  pts_plt <- ggplot(df, aes(x = !! plot_var_x, y = !! plot_var_y)) + 
    geom_point(size = 4, alpha = 0.5)

  return(pts_plt)
}

Apply the plot_duo function across columns using purrr::map():

### use character column names
plot_vars1 <- names(testdata)
plt1 <- plot_vars1 %>% purrr::map(., ~ plot_duo(testdata, .x, "v1"))
#> [1] "character column names supplied, use ensym()"
#> [1] "character column names supplied, use ensym()"
#> [1] "character column names supplied, use ensym()"
#> [1] "character column names supplied, use ensym()"
#> [1] "character column names supplied, use ensym()"

str(plt1, max.level = 1)
#> List of 5
#>  $ :List of 9
#>   ..- attr(*, "class")= chr [1:2] "gg" "ggplot"
#>  $ :List of 9
#>   ..- attr(*, "class")= chr [1:2] "gg" "ggplot"
#>  $ :List of 9
#>   ..- attr(*, "class")= chr [1:2] "gg" "ggplot"
#>  $ :List of 9
#>   ..- attr(*, "class")= chr [1:2] "gg" "ggplot"
#>  $ :List of 9
#>   ..- attr(*, "class")= chr [1:2] "gg" "ggplot"

# test plot
plt1[[3]]

### use bare column names
# Ref: https://stackoverflow.com/a/49834499/
plot_vars2 <- rlang::exprs(v2, v3, v4)
plt2 <- plot_vars2 %>% purrr::map(., ~ plot_duo(testdata, .x, rlang::expr(v1)))
#> [1] "bare column names supplied, use enquo()"
#> [1] "bare column names supplied, use enquo()"
#> [1] "bare column names supplied, use enquo()"

str(plt2, max.level = 1)
#> List of 3
#>  $ :List of 9
#>   ..- attr(*, "class")= chr [1:2] "gg" "ggplot"
#>  $ :List of 9
#>   ..- attr(*, "class")= chr [1:2] "gg" "ggplot"
#>  $ :List of 9
#>   ..- attr(*, "class")= chr [1:2] "gg" "ggplot"

# test plot
plt2[[2]]

Created on 2019-02-18 by the reprex package (v0.2.1.9000)

混吃等死
#4 · 2020-01-24 03:12

Your problem is that aes doesn't know about your function's environment; it only looks in the global environment. So the variable dat declared within the function is not visible to ggplot2's aes function unless you pass the environment explicitly:

showplot1<-function(indata, inx, iny) {
    dat <- indata
    p <- ggplot(dat, aes(x=dat[,inx], y=dat[,iny]), environment = environment())
    p <- p + geom_point(size=4, alpha = 0.5)
    print(p)
}

Note the argument environment = environment() inside the ggplot() command. It should work now.
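
The lookup behaviour described here can be mimicked in base R. This is only a sketch of the mechanism, not ggplot2's actual code: eval_mapping and demo are hypothetical helpers, and the example assumes there is no object named dat in your global environment.

```r
# Evaluate an expression in the data first, then in a chosen enclosing
# environment. This imitates how a mapping expression gets resolved.
eval_mapping <- function(expr, data, enclos) {
  eval(expr, envir = data, enclos = enclos)
}

demo <- function(indata, inx) {
  dat  <- indata            # exists only inside this function call
  here <- environment()     # the function's own environment
  expr <- quote(dat[, inx])

  # Enclosing environment is the global one: `dat` cannot be found there
  # (assuming no global object called `dat` exists)
  failed <- tryCatch(
    { eval_mapping(expr, indata, globalenv()); FALSE },
    error = function(e) TRUE
  )

  # Enclosing environment is the function's own: `dat` and `inx` are found
  found <- eval_mapping(expr, indata, here)

  list(failed = failed, found = found)
}

res <- demo(data.frame(v1 = 1:3, v2 = 4:6), inx = 2)
res$failed  # TRUE
res$found   # 4 5 6
```

Passing environment = environment() to ggplot() plays the role of the second call: it tells the evaluation where to continue looking once the data frame itself has no matching column.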

▲ chillily
#5 · 2020-01-24 03:20

I strongly suggest using aes_q instead of passing vectors to aes (@Arun's answer). It may look a bit more complicated, but it is more flexible, e.g. when updating the data.

showplot1 <- function(indata, inx, iny){
  p <- ggplot(indata, 
              aes_q(x = as.name(names(indata)[inx]), 
                    y = as.name(names(indata)[iny])))
  p + geom_point(size=4, alpha = 0.5)
}

And here's the reason why it is preferable:

# test data (using non-standard names)
testdata<-data.frame(v1=rnorm(100), v2=rnorm(100), v3=rnorm(100), v4=rnorm(100), v5=rnorm(100))
names(testdata) <- c("a-b", "c-d", "e-f", "g-h", "i-j")
testdata2 <- data.frame(v1=rnorm(100), v2=rnorm(100), v3=rnorm(100), v4=rnorm(100), v5=rnorm(100))
names(testdata2) <- c("a-b", "c-d", "e-f", "g-h", "i-j")

# works
showplot1(indata=testdata, inx=2, iny=3)
# this update works in the aes_q version
showplot1(indata=testdata, inx=2, iny=3) %+% testdata2

Note: As of ggplot2 v2.0.0, aes_q() has been replaced with aes_() to be consistent with SE versions of NSE functions in other packages.
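
One reason the as.name() route copes with the non-standard names used above is that a symbol refers to the column as a whole, whereas the same string parsed as code becomes an arithmetic expression. A small base R sketch, using a made-up two-column data frame rather than the test data from this answer:

```r
# Hypothetical data frame with non-syntactic column names
df <- data.frame(1:3, 4:6)
names(df) <- c("a-b", "c-d")

# A symbol built from the name refers to the column itself
sym_expr  <- as.name(names(df)[2])
by_symbol <- eval(sym_expr, df)      # the values of column "c-d"

# The same string parsed as R code is the call `c - d`
parsed <- parse(text = names(df)[2])[[1]]
parse_failed <- tryCatch(
  { eval(parsed, df); FALSE },
  error = function(e) TRUE
)

class(sym_expr)   # "name"
class(parsed)     # "call"
by_symbol         # 4 5 6
parse_failed      # TRUE
```

Any interface that parses a string into an expression would hit the second case with names like these, so building the symbol explicitly is the safer choice.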

Emotional °昔
#6 · 2020-01-24 03:23

Try:

showplot1 <- function(indata, inx, iny) {
    x <- names(indata)[inx] 
    y <- names(indata)[iny] 
    p <- ggplot(indata, aes_string(x = x, y = y))
    p + geom_point(size=4, alpha = 0.5)
}

Edit, to show what's happening: aes_string takes quoted (string) arguments, and names() retrieves those strings from your column numbers.
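
To my knowledge, aes_string() has been soft-deprecated since ggplot2 3.0.0, and the .data pronoun is the usual replacement for string-based column lookup. The sketch below is an alternative that does not appear in the original answer; showplot_data is a hypothetical name, and it assumes ggplot2 3.0.0 or later is installed.

```r
library(ggplot2)

# Sketch: look up columns by position, then map them through .data[[ ]]
showplot_data <- function(indata, inx, iny) {
  x <- names(indata)[inx]   # column name at position inx
  y <- names(indata)[iny]   # column name at position iny
  ggplot(indata, aes(x = .data[[x]], y = .data[[y]])) +
    geom_point(size = 4, alpha = 0.5)
}

testdata <- data.frame(v1 = rnorm(100), v2 = rnorm(100), v3 = rnorm(100))
p <- showplot_data(testdata, inx = 2, iny = 3)
p
```

Because .data[[x]] subsets by name rather than parsing x as code, this form should also tolerate column names that are not syntactically valid R names.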

贪生不怕死
#7 · 2020-01-24 03:24

A variation on @Shadow's answer, using new features from ggplot2 v3.0.0:

showplot <- function(indata, inx, iny){
  nms <- names(indata)
  x <- nms[inx]
  y <- nms[iny]
  p <- ggplot(indata, aes(x = !!ensym(x), y = !!ensym(y)))
  p + geom_point(size=4, alpha = 0.5)
}

testdata <- data.frame(v1=rnorm(100), v2=rnorm(100), v3=rnorm(100), v4=rnorm(100), v5=rnorm(100))
names(testdata) <- c("a-b", "c-d", "e-f", "g-h", "i-j")
showplot(indata=testdata, inx=2, iny=3)

ensym creates a symbol from the string contained in a variable (so we first have to create those variables at the start of the function), then !! unquotes it, which means it will work as if you had fed the function raw names.

!! works only in the context of functions designed to support it, usually tidyverse functions; elsewhere it just means "not not" (similar to as.logical).
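
The "not not" reading is easy to check in plain R, outside any tidyverse function. A minimal sketch with made-up values:

```r
# Outside tidy evaluation, !! is just logical negation applied twice
x <- 5
double_neg <- !!x              # same as !(!x)
coerced    <- as.logical(x)

double_neg                     # TRUE
identical(double_neg, coerced) # TRUE

!!0                            # FALSE
!!c(0, 2)                      # FALSE TRUE
```

Inside aes() and other functions that support quasiquotation, the same two characters are intercepted before R evaluates them as negation, which is why the unquoting in showplot works.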
