Question:
I need to draw a scatterplot that addresses variables by their column numbers instead of their names, i.e. instead of ggplot(dat, aes(x=Var1, y=Var2)) I need something like ggplot(dat, aes(x=dat[,1], y=dat[,2])). (I say 'something' because the latter doesn't work.)
Here is my code:
showplot1 <- function(indata, inx, iny) {
  dat <- indata
  print(nrow(dat))  # this is just to show that object 'dat' is defined
  p <- ggplot(dat, aes(x = dat[, inx], y = dat[, iny]))
  p + geom_point(size = 4, alpha = 0.5)
}
testdata<-data.frame(v1=rnorm(100), v2=rnorm(100), v3=rnorm(100), v4=rnorm(100), v5=rnorm(100))
showplot1(indata=testdata, inx=2, iny=3)
# Error in eval(expr, envir, enclos) : object 'dat' not found
Answer 1:
A variation on @Shadow's answer using new features from ggplot2 v3.0.0:
showplot <- function(indata, inx, iny) {
  nms <- names(indata)
  x <- nms[inx]
  y <- nms[iny]
  p <- ggplot(indata, aes(x = !!ensym(x), y = !!ensym(y)))
  p + geom_point(size = 4, alpha = 0.5)
}
testdata <- data.frame(v1=rnorm(100), v2=rnorm(100), v3=rnorm(100), v4=rnorm(100), v5=rnorm(100))
names(testdata) <- c("a-b", "c-d", "e-f", "g-h", "i-j")
showplot(indata=testdata, inx=2, iny=3)
ensym creates a symbol from the string contained in a variable (so we first have to create those variables at the start of the function), then !! unquotes it, which means it will work as if you had fed the function raw names.
!! works only in the context of functions designed to support it, usually tidyverse functions; elsewhere it just means "not not" (similar to as.logical).
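To make the two roles of !! concrete, here is a small sketch. It assumes the rlang package (which provides ensym() and !!) is installed. sym() and expr() are rlang functions that do not appear in the answer above; sym() is used here instead of ensym() because the sketch runs outside a function. The variable name col is hypothetical.

```r
library(rlang)

# Outside tidy-evaluation functions, !! is plain double negation,
# so it coerces its argument to logical, much like as.logical().
plain <- !!5
same_as_logical <- identical(plain, as.logical(5))

# Inside a function that supports unquoting (here rlang::expr()),
# !! inserts the symbol as if the name had been typed directly.
col <- "c-d"                       # hypothetical column name held as a string
col_sym <- sym(col)                # string -> symbol
unquoted <- expr(mean(!!col_sym))  # builds the call mean(`c-d`)
```

The same mechanism is what lets aes() receive a column name that was only known as a string.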
Answer 2:
Your problem is that aes doesn't know your function's environment and only looks within the global environment. So the variable dat declared within the function is not visible to ggplot2's aes function unless you pass the environment explicitly:
showplot1 <- function(indata, inx, iny) {
  dat <- indata
  p <- ggplot(dat, aes(x = dat[, inx], y = dat[, iny]), environment = environment())
  p <- p + geom_point(size = 4, alpha = 0.5)
  print(p)
}
Note the argument environment = environment() inside the ggplot() command. It should work now.
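The lookup rule described in this answer can be illustrated in base R alone. This is only a sketch of the scoping idea, not ggplot2's actual internals; lookup_demo and dat_local_only are hypothetical names chosen so they do not clash with anything in the global environment.

```r
lookup_demo <- function() {
  # This variable exists only inside the function
  dat_local_only <- data.frame(a = 1:3)
  expr <- quote(nrow(dat_local_only))

  # Evaluating in the global environment cannot see the local variable,
  # so the error message is captured instead of a result
  from_global <- tryCatch(
    eval(expr, envir = globalenv()),
    error = function(e) conditionMessage(e)
  )

  # Evaluating in the function's own environment finds it
  from_local <- eval(expr, envir = environment())

  list(from_global = from_global, from_local = from_local)
}

res <- lookup_demo()
```

Here res$from_global holds an "object not found" style message, while res$from_local holds the row count.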
Answer 3:
Try:
showplot1 <- function(indata, inx, iny) {
  x <- names(indata)[inx]
  y <- names(indata)[iny]
  p <- ggplot(indata, aes_string(x = x, y = y))
  p + geom_point(size = 4, alpha = 0.5)
}
Edited to show what's happening: names() looks up the column names from your numbers, and aes_string() accepts those names as quoted (string) arguments.
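Newer ggplot2 releases mark aes_string() as deprecated in favour of tidy evaluation, so the same name lookup can also be sketched with the .data pronoun inside aes(). This is an alternative that none of the answers here use; showplot_pronoun is a hypothetical name, and the sketch assumes a ggplot2 version with tidy evaluation support (3.0.0 or later).

```r
library(ggplot2)

showplot_pronoun <- function(indata, inx, iny) {
  # Translate column positions into column names
  x <- names(indata)[inx]
  y <- names(indata)[iny]
  # .data[[name]] looks the column up by its name held in a string
  ggplot(indata, aes(x = .data[[x]], y = .data[[y]])) +
    geom_point(size = 4, alpha = 0.5)
}

testdata <- data.frame(v1 = rnorm(100), v2 = rnorm(100), v3 = rnorm(100))
p <- showplot_pronoun(testdata, inx = 2, iny = 3)
```

Because the lookup goes through a string, this form should also cope with non-syntactic names such as "c-d".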
Answer 4:
I strongly suggest using aes_q instead of passing vectors to aes (@Arun's answer). It may look a bit more complicated, but it is more flexible, e.g. when updating the data.
showplot1 <- function(indata, inx, iny) {
  p <- ggplot(indata,
              aes_q(x = as.name(names(indata)[inx]),
                    y = as.name(names(indata)[iny])))
  p + geom_point(size = 4, alpha = 0.5)
}
And here's the reason why it is preferable:
# test data (using non-standard names)
testdata<-data.frame(v1=rnorm(100), v2=rnorm(100), v3=rnorm(100), v4=rnorm(100), v5=rnorm(100))
names(testdata) <- c("a-b", "c-d", "e-f", "g-h", "i-j")
testdata2 <- data.frame(v1=rnorm(100), v2=rnorm(100), v3=rnorm(100), v4=rnorm(100), v5=rnorm(100))
names(testdata2) <- c("a-b", "c-d", "e-f", "g-h", "i-j")
# works
showplot1(indata=testdata, inx=2, iny=3)
# this update works in the aes_q version
showplot1(indata=testdata, inx=2, iny=3) %+% testdata2
Note: As of ggplot2 v2.0.0, aes_q() has been replaced with aes_() to be consistent with SE versions of NSE functions in other packages.
Answer 5:
For completeness, I think it's safer to use column names instead of indices, because column positions within a data frame can change and cause unexpected results.
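A minimal base-R sketch of that risk, using a small made-up data frame (df and reordered are hypothetical names):

```r
df <- data.frame(v1 = 1:3, v2 = 4:6)
reordered <- df[, c("v2", "v1")]     # same data, columns swapped

by_index_before <- df[[1]]           # position 1 is v1 here
by_index_after  <- reordered[[1]]    # position 1 is now v2

# Access by name is unaffected by the reordering
by_name_same <- identical(df[["v1"]], reordered[["v1"]])
```

The index 1 silently returns a different column after the swap, while the name "v1" keeps pointing at the same values.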
The plot_duo function below (taken from this answer) can take its input either as strings or as bare column names:
library(rlang)
library(purrr)
library(dplyr)
library(ggplot2)
theme_set(theme_classic(base_size = 14))
set.seed(123456)
testdata <- data.frame(v1 = rnorm(100), v2 = rnorm(100), v3 = rnorm(100),
v4 = rnorm(100), v5 = rnorm(100))
plot_duo <- function(df, plot_var_x, plot_var_y) {
  # check if input is character or bare column name to
  # use ensym() or enquo() accordingly
  if (is.character(plot_var_x)) {
    print('character column names supplied, use ensym()')
    plot_var_x <- ensym(plot_var_x)
  } else {
    print('bare column names supplied, use enquo()')
    plot_var_x <- enquo(plot_var_x)
  }
  if (is.character(plot_var_y)) {
    plot_var_y <- ensym(plot_var_y)
  } else {
    plot_var_y <- enquo(plot_var_y)
  }
  # unquote the variables using !! (bang bang) so ggplot can evaluate them
  pts_plt <- ggplot(df, aes(x = !! plot_var_x, y = !! plot_var_y)) +
    geom_point(size = 4, alpha = 0.5)
  return(pts_plt)
}
Apply the plot_duo function across columns using purrr::map():
### use character column names
plot_vars1 <- names(testdata)
plt1 <- plot_vars1 %>% purrr::map(., ~ plot_duo(testdata, .x, "v1"))
#> [1] "character column names supplied, use ensym()"
#> [1] "character column names supplied, use ensym()"
#> [1] "character column names supplied, use ensym()"
#> [1] "character column names supplied, use ensym()"
#> [1] "character column names supplied, use ensym()"
str(plt1, max.level = 1)
#> List of 5
#> $ :List of 9
#> ..- attr(*, "class")= chr [1:2] "gg" "ggplot"
#> $ :List of 9
#> ..- attr(*, "class")= chr [1:2] "gg" "ggplot"
#> $ :List of 9
#> ..- attr(*, "class")= chr [1:2] "gg" "ggplot"
#> $ :List of 9
#> ..- attr(*, "class")= chr [1:2] "gg" "ggplot"
#> $ :List of 9
#> ..- attr(*, "class")= chr [1:2] "gg" "ggplot"
# test plot
plt1[[3]]
### use bare column names
# Ref: https://stackoverflow.com/a/49834499/
plot_vars2 <- rlang::exprs(v2, v3, v4)
plt2 <- plot_vars2 %>% purrr::map(., ~ plot_duo(testdata, .x, rlang::expr(v1)))
#> [1] "bare column names supplied, use enquo()"
#> [1] "bare column names supplied, use enquo()"
#> [1] "bare column names supplied, use enquo()"
str(plt2, max.level = 1)
#> List of 3
#> $ :List of 9
#> ..- attr(*, "class")= chr [1:2] "gg" "ggplot"
#> $ :List of 9
#> ..- attr(*, "class")= chr [1:2] "gg" "ggplot"
#> $ :List of 9
#> ..- attr(*, "class")= chr [1:2] "gg" "ggplot"
plt1[[2]]
Created on 2019-02-18 by the reprex package (v0.2.1.9000)
Answer 6:
A provisional solution I found for the moment:
showplot1 <- function(indata, inx, iny) {
  dat <- data.frame(myX = indata[, inx], myY = indata[, iny])
  print(nrow(dat))  # this is just to show that object 'dat' is defined
  p <- ggplot(dat, aes(x = myX, y = myY))
  p + geom_point(size = 4, alpha = 0.5)
}
But I don't really like it, because in my real code I need other columns from indata, and here I would have to define all of them explicitly in dat<-...
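One possible way around that drawback, sketched under the assumption that indata has no existing columns called myX or myY (they would be overwritten), is to copy the whole data frame and only add the two helper columns. showplot_keep is a hypothetical name, not part of the original code.

```r
library(ggplot2)

showplot_keep <- function(indata, inx, iny) {
  dat <- indata                  # keep every original column
  dat$myX <- indata[[inx]]       # helper column for the x axis
  dat$myY <- indata[[iny]]       # helper column for the y axis
  ggplot(dat, aes(x = myX, y = myY)) +
    geom_point(size = 4, alpha = 0.5)
}

testdata <- data.frame(v1 = rnorm(100), v2 = rnorm(100), v3 = rnorm(100),
                       v4 = rnorm(100), v5 = rnorm(100))
p <- showplot_keep(testdata, inx = 2, iny = 3)
```

The other columns stay available to later layers, for example for mapping colour or facets, without listing them one by one.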