as.numeric is rounding off values

Posted 2020-03-25 04:07

I am trying to convert a character column of a data frame to numeric. However, the values I get back are rounded.

Nothing I have tried from other SO questions of this kind has worked for me. I have checked the class of the column vector I am trying to convert, and it is character, not factor.

Here is my code snippet:

data1 <- read.csv("file.csv", nrows = 100, colClasses = c("factor", "factor", "character", "character"))
# strip everything except digits and the decimal point
y <- Vectorize(function(x) gsub("[^\\.\\d]", "", x, perl = TRUE))
data1$colC <- y(data1$colC)
data1$colD <- y(data1$colD)

data1$colC <- as.numeric(data1$colC)
data1$colD <- as.numeric(data1$colD)

Edit:

> dput(head(data1))
structure(list(colA = structure(c(2L, 2L, 5L, 6L, 5L, 6L), .Label = c("(Other)",
"Direct", "Display", "Email", "Organic Search", "Paid Search", 
"Referral", "Social Network"), class = "factor"), colB = structure(c(1L, 
2L, 2L, 2L, 1L, 1L), .Label = c("No", "Yes"), class = "factor"), 
colC = c("4023107.87", "3180863.42", "2558777.81", "2393736.25", 
"1333148.48", "1275627.13"), colD = c("49731596.35", "33604210.26", 
"20807573.12", "20061467.30", "10488358.77", "10442249.09"
)), .Names = c("colA", "colB", "colC", "colD"), row.names = c(NA, 
6L), class = "data.frame")

Tags: r numeric
1 answer
别忘想泡老子
Reply #2 · 2020-03-25 04:25

I think this is a representation problem, not an actual rounding problem ...

options("digits") ## 7

From ?options:

‘digits’: controls the number of digits to print when printing numeric values. It is a suggestion only. Valid values are 1...22 with default 7. See the note in ‘print.default’ about values greater than 15.

digits can be changed either on a one-off basis, e.g. print(object, digits = ...), or globally, e.g. options(digits = 20). (20 is probably overkill, but it helps you see what's happening; based on the results below, 10 might serve your needs well.)

as.numeric(data1$colC)
[1] 4023108 3180863 2558778 2393736 1333148 1275627
print(as.numeric(data1$colC),digits=10)
[1] 4023107.87 3180863.42 2558777.81 2393736.25 1333148.48 1275627.13
print(as.numeric(data1$colC),digits=20)
[1] 4023107.8700000001118 3180863.4199999999255 2558777.8100000000559
[4] 2393736.2500000000000 1333148.4799999999814 1275627.1299999998882
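
To check for yourself that nothing is lost in the conversion, here is a minimal sketch. It assumes a small hypothetical vector colC standing in for data1$colC; the names colC, x, two_dp, and old are illustrative only and not from the question.

```r
# Hypothetical stand-in for data1$colC from the question.
colC <- c("4023107.87", "3180863.42", "2558777.81")
x <- as.numeric(colC)

# The stored doubles compare equal to the full-precision literals;
# only the default printed form is shortened.
x == c(4023107.87, 3180863.42, 2558777.81)

# sprintf() returns character strings with a fixed number of decimals,
# regardless of options("digits").
two_dp <- sprintf("%.2f", x)
two_dp

# Change the print precision globally, then restore the previous setting.
old <- options(digits = 10)  # options() returns the previous values
print(x)
options(old)
```

If the extra digits are only needed for display, sprintf() (or print(x, digits = 10) as above) avoids touching the global option at all.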