R dplyr: rename and select using a string variable

Posted 2019-04-30 06:15

Question:

I am trying to select a subset of variables in my data frame and rename them in the new data frame. I have a large number of variables to rename. I am using dplyr::select and dplyr::select_.

Since I have many variables to rename, I am wondering whether I could use a string variable to do the renaming, but I am not sure it is possible. Using strings would help me manage the mapping between new and old names. Here is an example with dplyr::select:

library(dplyr)
library(nycflights13) 
set.seed(123)
data <- sample_n(flights, 3)

select(data, yr = year, mon = month, deptime = dep_time)

My question is how I could pass these arguments as strings, that is, the newvariable=oldvariable pairs, and then use dplyr::select_:

col_vector <- c("year", "month", "dep_time")
select_(data, .dots = col_vector)

The strings I have in mind are:

rename_vector <- c("yr=year","mon=month","deptime=dep_time")

Any suggestions would be very helpful.

Answer 1:

Instead of a vector, you can pass a named list to .dots in dplyr::select_, where the names are the new column names and the values are the old column names given as character strings.

> rename_list <- list(sepal_length = "Sepal.Length", sepal_width = "Sepal.Width")
> iris %>% tbl_df %>% select_(.dots = rename_list)

Source: local data frame [150 x 2]

   sepal_length sepal_width
          (dbl)       (dbl)
1           5.1         3.5
2           4.9         3.0
3           4.7         3.2
4           4.6         3.1
5           5.0         3.6
6           5.4         3.9
7           4.6         3.4
8           5.0         3.4
9           4.4         2.9
10          4.9         3.1
..          ...         ...
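
Note that select_ and the other underscore verbs were deprecated as of dplyr 0.7.0, so the call above may warn or fail on a recent installation. Below is a minimal sketch of the same idea for newer versions, assuming a dplyr recent enough to provide tidyselect's all_of(). It also shows one way to turn the "new=old" strings from the question into a named vector. The column names are adapted to iris for illustration, and `lookup` is a hypothetical helper name.

```r
library(dplyr)

# "new=old" strings in the style of the question, adapted to iris columns.
rename_vector <- c("sepal_length=Sepal.Length", "sepal_width=Sepal.Width")

# Split each string at "=" into its new and old name.
parts <- strsplit(rename_vector, "=", fixed = TRUE)
new_names <- vapply(parts, `[`, character(1), 1)
old_names <- vapply(parts, `[`, character(1), 2)

# Named character vector: names are the new columns, values are the old ones.
lookup <- setNames(old_names, new_names)

# all_of() selects the listed columns and applies the vector names as renames.
result <- iris %>% select(all_of(lookup))
head(result, 3)
```

Keeping the mapping in a single named vector means the same `lookup` can also be passed to rename(iris, all_of(lookup)) when the remaining columns should be kept.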


Answer 2:

dplyr

Another option is to use dplyr together with setNames, passing a vector of the new column names:

iris %>%
  select(Sepal.Length, Sepal.Width) %>% 
  setNames(c("sepal_length","sepal_width")) 

Base package

setNames(iris[, c("Sepal.Length", "Sepal.Width")], 
         c("sepal_length", "sepal_width"))
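
Since the question asks for a way to keep the new-name/old-name mapping in one place, the base approach can also be driven by a single named vector. This is only a sketch; `mapping` and `renamed` are hypothetical names, not part of the original answer.

```r
# Hypothetical mapping: names are the new column names, values are the old ones.
mapping <- c(sepal_length = "Sepal.Length", sepal_width = "Sepal.Width")

# List-style indexing keeps a data frame even when only one column is selected;
# setNames() then relabels the selected columns with the new names.
renamed <- setNames(iris[unname(mapping)], names(mapping))
head(renamed, 3)
```

Because the old and new names live in the same object, they cannot drift out of order the way two separate vectors can.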

data.table

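library(data.table)
# setnames() renames by reference, so work on a data.table copy of the selected columns
iris_dt <- as.data.table(iris)[, .(Sepal.Length, Sepal.Width)]
setnames(iris_dt, old = c("Sepal.Length", "Sepal.Width"), new = c("sepal_length", "sepal_width"))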