R numeric to char precision loss

Posted 2019-08-21 02:56

I want to convert a numeric vector with many digits to character. I tried the solutions below, which work for a single number but not for a vector. This works:

options(digits=20)
options(scipen=99999)
x<-129483.19999999999709;format(round(x, 12), nsmall = 12)
[1] "129483.199999999997"

But this does not. How can I keep numeric precision in character form for numeric vectors?

> y <- c(129483.19999999999709, 1.3546746874,687676846.2546746464)

Especially problematic is 687676846.2546746464. I also tried:

> specify_decimal(y, 12)
[1] "129483.199999999997"    "1.354674687400"         "687676846.254674673080"
> formatC(y, digits = 12, format = "f")
[1] "129483.199999999997"    "1.354674687400"         "687676846.254674673080"
> formattable(y, digits = 12, format = "f")
[1] 129483.199999999997    1.354674687400         687676846.254674673080
> sprintf(y, fmt='%#.12g')
[1] "129483.200000" "1.35467468740" "687676846.255"
> sprintf(y, fmt='%#.22g')
[1] "129483.1999999999970896" "1.354674687399999966075" "687676846.2546746730804"

Expected result:

[1] "129483.199999999997" "1.354674687400" "687676846.254674646400"

It seems that the precision loss occurs only once; it is not repeated.

> require(dplyr)
> convert <- function(x) as.numeric(as.character(x))
> 687676846.2546746464 %>% convert
[1] 687676846.25467503
> 687676846.2546746464 %>% convert %>% convert %>% convert
[1] 687676846.25467503

Here I only get 5 decimal digits of precision, and, more problematically, I can't know beforehand what precision I am going to get.
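One way to estimate the available precision beforehand is to look at the gap between adjacent representable values at each number's magnitude. This is only a sketch: it assumes the values are stored as IEEE 754 doubles with a 53-bit significand (the usual case for R numerics), that they are finite, nonzero and not subnormal, and the helper name double_spacing is made up for illustration.

```r
# Hypothetical helper: gap between adjacent doubles around x,
# assuming a 53-bit significand (1 implicit bit + 52 stored bits).
double_spacing <- function(x) 2^(floor(log2(abs(x))) - 52)

y <- c(129483.19999999999709, 1.3546746874, 687676846.2546746464)

spacing <- double_spacing(y)

# Rough count of decimals that are coarser than the spacing and can
# therefore be distinguished at this magnitude.
safe_decimals <- floor(-log10(spacing))

print(spacing)
print(safe_decimals)
```

For the vector above this gives roughly 10, 15 and 6 usable decimals, so the larger the integer part, the fewer decimals remain.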

1 answer
劫难
Reply #2 · 2019-08-21 03:20

In the end I could do what I wanted using these functions. addtrailingzeroes appends zeroes after the decimal point of x until it has the requested number of decimals.

# Number of decimals in the character representation of each element of x
nbdec <- function(x) {
  x1 <- as.character(x)
  xsplit <- strsplit(x1, "\\.")
  xlength <- sapply(xsplit, function(d) nchar(d)[2])
  xlength <- ifelse(is.na(xlength), 0, xlength)
  return(xlength)
}

# String of zeroes needed to pad each element of x to dig decimals
trailingzeroes <- function(x, dig) {
  res <- rep(NA_character_, length(x))
  for (i in seq_along(x)) {
    if (!is.na(x[i])) {
      res[i] <- paste0(rep(0, max(0, dig - nbdec(x[i]))), collapse = "")
    } else {
      res[i] <- ""
    }
  }
  return(res)
}

# Decimal point to append when an element has no decimals yet
trailingcommas <- function(x) ifelse(is.na(x), NA, ifelse(nbdec(x) == 0, ".", ""))

# Element, optional decimal point, then the padding zeroes; NA stays NA
addtrailingzeroes <- function(x, digits) {
  return(ifelse(!is.na(x), paste0(x, trailingcommas(x), trailingzeroes(x, digits)), NA))
}
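The same padding can also be sketched directly on character strings in a vectorized way, without the per-element loop. The function name pad_decimals is made up for this illustration and is not part of the code above; it assumes the input is already a character vector in plain decimal notation (no exponent).

```r
# Hypothetical helper: pad decimal strings with zeroes up to `digits` decimals.
pad_decimals <- function(s, digits) {
  has_point <- grepl(".", s, fixed = TRUE)
  # Count the characters after the decimal point (0 if there is none)
  n_dec <- ifelse(has_point, nchar(sub("^[^.]*\\.", "", s)), 0L)
  out <- paste0(
    s,
    ifelse(has_point, "", "."),              # add a point if missing
    strrep("0", pmax(0, digits - n_dec))     # zeroes still needed
  )
  out[is.na(s)] <- NA_character_             # keep missing values missing
  out
}

res <- pad_decimals(c("1.3546746874", "12", "129483.2", NA), 12)
print(res)
```

Working on strings keeps the padding step separate from the numeric-to-character conversion, which is where the precision question actually arises.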

However, to suppress inaccuracies and rounding errors, x has to be rounded first using roundnumerics.max:

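# Note: the original code used two helpers that are not shown in this post,
# %==% and pprint. They are assumed here to be an NA-tolerant equality test
# and a printing function, and are replaced with base R equivalents.
roundnumerics.max <- function(df, startdig = 12) {
  for (icol in seq_len(ncol(df))) {
    if (is.numeric(df[, icol])) {
      dig <- startdig
      # Reduce the number of decimals until the column survives a
      # round trip through character unchanged
      while (any(as.numeric(as.character(df[, icol])) != df[, icol], na.rm = TRUE)) {
        dig <- dig - 1
        df[, icol] <- round(df[, icol], digits = dig)
        if (dig == 0) {
          # Message moved before break; after it, it was never reached
          message("ERROR: zero numeric accuracy")
          break
        }
      }
      message("Numeric accuracy for column ", icol, " ", colnames(df)[icol], " is ", dig)
    }
  }
  return(data.frame(df, stringsAsFactors = FALSE))
}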
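A possible alternative to the rounding loop, offered only as a sketch: format each element with at most 15 significant digits, which is about what a double reliably carries, so the trailing noise never makes it into the string. This assumes the values print in fixed notation (true for the magnitudes in the question) and formats element by element so that each value keeps its own number of decimals.

```r
y <- c(129483.19999999999709, 1.3546746874, 687676846.2546746464)

# format() picks the fewest significant digits (at most 15 here) needed
# to show each value; vapply() avoids a common format for the whole vector.
clean <- vapply(y, function(v) format(v, digits = 15), character(1))

print(clean)
```

The third value comes out with 6 decimals, not the 10 that were typed, because the remaining typed digits were never stored in the double.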

This is slow and far from elegant. I still find it hard to believe that R limits accuracy to about 16 significant digits and, without telling you, adds inaccurate noise that makes values diverge when you increase the digits option.
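For context, both the limit and the apparent noise can be reproduced with a value as simple as 0.1. The sketch below assumes R's numeric type is an IEEE 754 double with a 53-bit significand, as on common platforms. Under that assumption, the extra digits are the decimal expansion of the nearest binary value, not random noise.

```r
# Bits in the significand of a double (53 on IEEE 754 platforms)
bits <- .Machine$double.digits

# Decimal digits that always survive storage in a double
reliable_digits <- floor((bits - 1) * log10(2))

# 0.1 has no exact binary representation: 15 digits hide the
# difference, 17 digits expose the stored value.
shown15 <- sprintf("%.15g", 0.1)
shown17 <- sprintf("%.17g", 0.1)

print(c(bits = bits, reliable_digits = reliable_digits))
print(c(shown15, shown17))
```

So asking for more than about 15 significant digits reveals the stored binary value. It does not add precision to what was typed.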
