Factor with comma and percentage to numeric

Posted 2020-05-07 02:02

Question:

I have a column ("rates") which is a factor with several levels, such as:

16 Levels: -0,186% -0,229% -0,326% ...

When I try to convert it to numeric, NAs are introduced and I can't figure out how to do it properly.

rates=as.numeric(gsub(",", ".", rates))
rates=as.numeric(sub("%", "e-2", rates))

I also tried the following, which was the answer to a similar question, but it does not work either:

rates=as.numeric(gsub("\\%", "", rates))

Answer 1:

Another option is to use the parse_number() function from the readr package and specify that a comma is used as the decimal mark:

library(readr)
# parse_number() expects a character vector, so convert the factor first
parse_number(as.character(rates), locale = locale(decimal_mark = ','))

which gives:

[1] -0.186 -0.229 -0.326

Used data:

rates <- as.factor(c("-0,186%", "-0,229%", "-0,326%"))


Answer 2:

Use gsub:

# Example vector
vec <- as.factor(c("-0,186%", "-0,229%", "-0,326%"))

# Convert vector to numeric
vec <- as.numeric(gsub(",", ".", gsub("%", "", as.character(vec))))
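To make the result of that call visible, here is a minimal sketch that wraps the same two substitutions in a small helper. The name percent_to_numeric is hypothetical and not part of the original answer. Note that it returns the values as written in percent (so -0.186 stands for -0.186%), not as fractions.

```r
# Hypothetical helper; the name is illustrative, not from the original answer
percent_to_numeric <- function(x) {
  x <- as.character(x)                  # factor -> its character labels
  x <- gsub("%", "", x, fixed = TRUE)   # drop the percent sign
  x <- gsub(",", ".", x, fixed = TRUE)  # decimal comma -> decimal point
  as.numeric(x)
}

vec <- as.factor(c("-0,186%", "-0,229%", "-0,326%"))
res <- percent_to_numeric(vec)
print(res)
#> [1] -0.186 -0.229 -0.326
```

Converting with as.character() first avoids the classic pitfall of calling as.numeric() directly on a factor, which returns the internal integer codes rather than the labels.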


Answer 3:

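I assume the levels of your initial factor are character strings. In that case you need to make both replacements before calling as.numeric, because with only one of them applied the strings are still not valid numbers and are converted to NA: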

rates=as.numeric(gsub(",", ".", gsub("%", "e-2", rates)))
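The difference between the two-step attempt from the question and the nested call can be sketched as follows, reusing the sample data from Answer 1. The variable names step1 and fractions are illustrative only.

```r
rates <- as.factor(c("-0,186%", "-0,229%", "-0,326%"))

# Two separate steps: after the comma replacement the strings still end in "%",
# so as.numeric() cannot parse them and returns NA for every element
step1 <- suppressWarnings(as.numeric(gsub(",", ".", rates)))
print(step1)

# Nested call: both substitutions run first, as.numeric() runs once at the end.
# "%" becomes "e-2", so "-0,186%" is read as -0.186e-2
fractions <- as.numeric(gsub(",", ".", gsub("%", "e-2", rates)))
print(fractions)
```

Because "%" is replaced with "e-2", this variant yields fractions (for example -0.00186 for -0,186%), whereas simply removing the "%" keeps the values in percent units.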