Renaming a column name, by using the data frame ti

Posted 2020-07-11 09:54

I have a data frame called "Something". I am aggregating one of its numeric columns with summarise, and I want the name of the resulting column to contain "Something", i.e. the data frame's name.

Example:

    temp <- Something %>%
      group_by(Month) %>%
      summarise(avg_score = mean(score))

But I would like to name the aggregate column "avg_Something_score". Does that make sense?

Tags: r dplyr
5 answers
爱情/是我丢掉的垃圾
Reply #2 · 2020-07-11 10:02
library(dplyr)

# Take mtcars as an example
# Calculate the mean of mpg using cyl as group
data(mtcars)
Something <- mtcars

# Create a list of expressions (one-sided formulas)
dots <- list(~mean(mpg))

# Summarise, using setNames to name the resulting column
temp <- Something %>% 
  group_by(cyl) %>% 
  summarise_(.dots =  setNames(dots, 
                               paste0("avg_", as.character(quote(Something)), "_score")))
够拽才男人
Reply #3 · 2020-07-11 10:10

We can use the development version of dplyr (soon to be released as 0.6.0), which does this with quosures.

library(dplyr)
myFun <- function(data, group, value) {
  dataN <- quo_name(enquo(data))
  group <- enquo(group)
  value <- enquo(value)

  newName <- paste0("avg_", dataN, "_", quo_name(value))
  data %>%
    group_by(!!group) %>%
    summarise(!!newName := mean(!!value))
}

myFun(mtcars, cyl, mpg)
# A tibble: 3 × 2
#   cyl avg_mtcars_mpg
#  <dbl>          <dbl>
#1     4       26.66364
#2     6       19.74286
#3     8       15.10000

myFun(iris, Species, Petal.Width)
# A tibble: 3 × 2
#     Species avg_iris_Petal.Width
#     <fctr>                <dbl>
#1     setosa                0.246
#2 versicolor                1.326
#3  virginica                2.026

Here, enquo captures the input arguments, much as substitute does in base R, and converts them to quosures. quo_name converts a quosure to a string, and unquoting (!! or UQ) evaluates the quosure inside group_by/summarise/mutate etc. The column name on the left-hand side of the assignment (:=) can also be supplied by unquoting, which gives the column name of interest.
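In later dplyr releases the same idea can be written more compactly. The following is only a sketch, assuming dplyr 1.0 or newer together with an rlang version that supports the embrace operator ({{ }}) and glue-style names on the left of :=. The helper name avg_by is hypothetical and not part of the answer above.

```r
library(dplyr)

# Hypothetical helper: group `data` by `group`, average `value`, and
# name the result "avg_<data frame name>_<value column>".
avg_by <- function(data, group, value) {
  # Capture the name of the data frame as it was written in the call
  data_name <- deparse(substitute(data))

  data %>%
    group_by({{ group }}) %>%
    # "{data_name}" is interpolated as a string; "{{ value }}" becomes the
    # label of the column that was passed in
    summarise("avg_{data_name}_{{ value }}" := mean({{ value }}))
}

res <- avg_by(mtcars, cyl, mpg)
names(res)
```

For the call above, the columns should come out as cyl and avg_mtcars_mpg, matching the output of myFun.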

淡お忘
Reply #4 · 2020-07-11 10:12

It seems like it makes more sense to generate the new column name dynamically so that you don't have to hard-code the name of the data frame inside setNames. Maybe something like the function below, which takes a data frame, a grouping variable, and a numeric variable:

library(dplyr)
library(lazyeval)

my_fnc = function(data, group, value) {

  df.name = deparse(substitute(data))

  data %>%
    group_by_(group) %>%
    summarise_(avg = interp(~mean(v), v=as.name(value))) %>%
    rename_(.dots = setNames("avg", paste0("avg_", df.name, "_", value)))
}

Now let's run the function on two different data frames:

my_fnc(mtcars, "cyl", "mpg")
    cyl avg_mtcars_mpg
  <dbl>          <dbl>
1     4       26.66364
2     6       19.74286
3     8       15.10000
my_fnc(iris, "Species", "Petal.Width")
     Species avg_iris_Petal.Width
1     setosa                0.246
2 versicolor                1.326
3  virginica                2.026
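
The naming trick itself, deparse(substitute(data)), does not depend on lazyeval or dplyr. As a rough base-R sketch (my_avg is a hypothetical helper, and aggregate stands in here for the group_by_/summarise_ steps):

```r
# Hypothetical base-R variant: `group` and `value` are column names given
# as strings, as in my_fnc above.
my_avg <- function(data, group, value) {
  # Name of the data frame as written by the caller
  df_name <- deparse(substitute(data))

  # Mean of the value column within each level of the grouping column
  out <- aggregate(data[[value]], by = list(data[[group]]), FUN = mean)

  # Replace the default column names with the dynamic ones
  names(out) <- c(group, paste0("avg_", df_name, "_", value))
  out
}

res <- my_avg(mtcars, "cyl", "mpg")
names(res)
```

One caveat that applies to any function using this trick: if the data frame is piped in with %>%, substitute sees the pipe's placeholder (.) rather than the original name, so the function should be called directly, as above.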
ら.Afraid
Reply #5 · 2020-07-11 10:18

You can use rename_ from dplyr with deparse(substitute(Something)) like this:

Something %>%
  group_by(Month) %>%
  summarise(avg_score = mean(score)) %>%
  rename_(.dots = setNames("avg_score",
                           paste0("avg_", deparse(substitute(Something)), "_score")))
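
rename_ has since been deprecated in dplyr. A hedged sketch of the same rename using the non-underscore verb, assuming dplyr 0.7 or newer. The small Something data frame below is made-up example data, since the question's data is not shown, and quote() is used instead of substitute() so the snippet behaves the same when run inside a function.

```r
library(dplyr)

# Made-up example data standing in for the question's data frame
Something <- data.frame(
  Month = c("Jan", "Jan", "Feb"),
  score = c(1, 3, 5)
)

# Build the new column name from the data frame's name
new_name <- paste0("avg_", deparse(quote(Something)), "_score")

temp <- Something %>%
  group_by(Month) %>%
  summarise(avg_score = mean(score)) %>%
  # Unquote the string on the left of := to use it as the new name
  rename(!!new_name := avg_score)

names(temp)
```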
Emotional °昔
Reply #6 · 2020-07-11 10:25

You could use colnames(Something) <- c("score", "something_avg_score")
