converting multiple columns from character to numeric

Posted 2019-01-11 14:33

Question:

What is the most efficient way to convert multiple columns in a data frame from character to numeric format?

I have a dataframe called DF with all character variables.

I would like to do something like

for (i in names(DF)) {
    # DF$i would look for a column literally named "i"; [[i]] uses the value of i
    DF[[i]] <- as.numeric(DF[[i]])
}

Thank you

Answer 1:

You could try

DF <- data.frame("a" = as.character(0:5),
                 "b" = paste(0:5, ".1", sep = ""),
                 "c" = letters[1:6],
                 stringsAsFactors = FALSE)

# Check columns classes
sapply(DF, class)

#           a           b           c 
# "character" "character" "character" 

cols.num <- c("a","b")
DF[cols.num] <- sapply(DF[cols.num],as.numeric)
sapply(DF, class)

#          a           b           c 
#  "numeric"   "numeric" "character"
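
If you do not want to list the numeric columns by hand, one option is to test which character columns convert without producing new NA values. This is only a sketch: parses_as_numeric is a hypothetical helper written for this example, not a base R function, and it assumes that a column counts as numeric only when every non-missing value parses.

```r
DF <- data.frame(a = as.character(0:5),
                 b = paste0(0:5, ".1"),
                 c = letters[1:6],
                 stringsAsFactors = FALSE)

# Hypothetical helper: TRUE for a character vector whose non-missing
# values all parse as numbers (no new NA introduced by as.numeric)
parses_as_numeric <- function(x) {
  if (!is.character(x)) return(FALSE)
  converted <- suppressWarnings(as.numeric(x))
  !any(is.na(converted) & !is.na(x))
}

# Pick the columns automatically, then convert them as above
cols.num <- names(DF)[vapply(DF, parses_as_numeric, logical(1))]
DF[cols.num] <- lapply(DF[cols.num], as.numeric)
sapply(DF, class)
```

Column c is left alone here because as.numeric() would turn its letters into NA.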


Answer 2:

You can index the columns by position: dataset[, 1:9] <- sapply(dataset[, 1:9], as.numeric)



Answer 3:

I think I figured it out. Here's what I did (perhaps not the most elegant solution; suggestions on how to improve it are very much welcome):

#names of columns in data frame
cols <- names(DF)
# character variables
cols.char <- c("fx_code","date")
#numeric variables
cols.num <- cols[!cols %in% cols.char]

DF.char <- DF[cols.char]
DF.num <- as.data.frame(lapply(DF[cols.num],as.numeric))
DF2 <- cbind(DF.char, DF.num)


Answer 4:

I realize this is an old thread, but I wanted to post a solution similar to your request for a function (I just ran into a similar issue myself while trying to format an entire table as percentage labels).

Assume you have a df with 5 character columns you want to convert. First, I create a table containing the names of the columns I want to manipulate:

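col_to_convert <- data.frame(row = 1:5,
                             col = c("col1", "col2", "col3", "col4", "col5"),
                             stringsAsFactors = FALSE)

for (i in 1:max(col_to_convert$row)) {
  colname <- col_to_convert$col[i]
  colnum <- which(colnames(df) == colname)
  # collect the converted cells in a numeric vector; writing numbers back into
  # a character column one cell at a time would coerce them to character again
  converted <- numeric(nrow(df))
  for (j in seq_len(nrow(df))) {
    converted[j] <- as.numeric(df[j, colnum])
  }
  df[[colnum]] <- converted
}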

This is not ideal for large tables as it goes cell by cell, but it would get the job done.
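
For larger tables, a sketch of the same idea that converts a whole column per iteration instead of one cell at a time might look like this. The sample df and its values are made up for illustration; only the column names col1 to col5 come from the answer above.

```r
# Made-up sample data with the five character columns plus one to keep
df <- data.frame(col1 = c("1", "2"), col2 = c("3", "4"),
                 col3 = c("5.5", "6.5"), col4 = c("7", "8"),
                 col5 = c("9", "10"), label = c("x", "y"),
                 stringsAsFactors = FALSE)

cols <- c("col1", "col2", "col3", "col4", "col5")

# as.numeric() is vectorised, so each column is converted in one call
for (colname in cols) {
  df[[colname]] <- as.numeric(df[[colname]])
}

str(df)
```

This is also close to the loop the question asked for, with [[ ]] used in place of $.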



Answer 5:

If you're already using the tidyverse, this replaces all character columns with numeric, and leaves the rest alone:

library(dplyr)
library(magrittr)

# solution
dataset %<>% mutate_if(is.character,as.numeric)

# to test
str(data.frame(x1 = c('1','2','3'),x2 = c('4','5','6'),stringsAsFactors = F))
str(data.frame(x1 = c('1','2','3'),x2 = c('4','5','6'),stringsAsFactors = F) %>% mutate_if(is.character,as.numeric))
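
In dplyr 1.0.0 and later, the same conversion can also be written with across() and where(), without the magrittr assignment pipe. A minimal sketch, assuming a recent dplyr is installed; the sample dataset and its x3 column are made up for illustration:

```r
library(dplyr)

dataset <- data.frame(x1 = c("1", "2", "3"),
                      x2 = c("4", "5", "6"),
                      x3 = c(TRUE, FALSE, TRUE),
                      stringsAsFactors = FALSE)

# Convert every character column; other column types are left untouched
dataset <- dataset %>%
  mutate(across(where(is.character), as.numeric))

str(dataset)
```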


Answer 6:

You could use convert from the hablar package:

library(dplyr)
library(hablar)

# Sample df (stolen from the solution by Luca Braglia)
df <- tibble("a" = as.character(0:5),
                 "b" = paste(0:5, ".1", sep = ""),
                 "c" = letters[1:6])

# insert variable names in num()
df %>% convert(num(a, b))

Which gives you:

# A tibble: 6 x 3
      a     b c    
  <dbl> <dbl> <chr>
1    0. 0.100 a    
2    1. 1.10  b    
3    2. 2.10  c    
4    3. 3.10  d    
5    4. 4.10  e    
6    5. 5.10  f   

Or if you are lazy, let retype() from hablar guess the right data type:

df %>% retype()

which gives you:

# A tibble: 6 x 3
      a     b c    
  <int> <dbl> <chr>
1     0 0.100 a    
2     1 1.10  b    
3     2 2.10  c    
4     3 3.10  d    
5     4 4.10  e    
6     5 5.10  f   
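
If you would rather not add a package, base R's type.convert() does a similar kind of guessing. This is a sketch of a comparable approach, not a claim that it matches retype() in every case; as.is = TRUE keeps non-numeric strings as character instead of turning them into factors.

```r
df <- data.frame(a = as.character(0:5),
                 b = paste0(0:5, ".1"),
                 c = letters[1:6],
                 stringsAsFactors = FALSE)

# Let R pick logical, integer, numeric or character for each column
df_guessed <- type.convert(df, as.is = TRUE)

sapply(df_guessed, class)
```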


Answer 7:

This example from ARobertson was the most efficient I saw here. I used it to convert integer columns to numeric. It did what I needed, with no loops and no long code.

library(dplyr)
library(magrittr)

# solution

dataset %<>% mutate_if(is.integer,as.numeric)