What is the correct/standard way to check if diffe

I often end up in situations where it is necessary to check if the obtained difference is above machine precision. Seems like for this purpose R has a handy variable: .Machine$double.eps. However when I turn to R source code for guidelines about using this value I see multiple different patterns.

Examples

Here are a few examples from stats library:

t.test.R

if(stderr < 10 *.Machine$double.eps * abs(mx))

chisq.test.R

if(abs(sum(p)-1) > sqrt(.Machine$double.eps))

integrate.R

rel.tol < max(50*.Machine$double.eps, 0.5e-28)

lm.influence.R

e[abs(e) < 100 * .Machine$double.eps * median(abs(e))] <- 0

princomp.R

if (any(ev[neg] < - 9 * .Machine$double.eps * ev[1L]))

etc.

Questions

How can one understand the reasoning behind all those different 10 *, 100 *, 50 * and sqrt() modifiers?
Are there guidelines about using .Machine$double.eps for adjusting differences due to precision issues?

标签： r floating-point rounding precision

2条回答

够拽才男人

2楼-- · 2020-05-18 05:18

The machine precision for double depends on its current value. .Machine$double.eps gives the precision when the values is 1. You can use the C function nextAfter to get the machine precision for other values.

library(Rcpp)
cppFunction("double getPrec(double x) {
  return nextafter(x, std::numeric_limits<double>::infinity()) - x;}")

(pr <- getPrec(1))
#[1] 2.220446e-16
1 + pr == 1
#[1] FALSE
1 + pr/2 == 1
#[1] TRUE
1 + (pr/2 + getPrec(pr/2)) == 1
#[1] FALSE
1 + pr/2 + pr/2 == 1
#[1] TRUE
pr/2 + pr/2 + 1 == 1
#[1] FALSE

Adding value a to value b will not change b when a is <= half of it's machine precision. Checking if the difference is smaler than machine precision is done with <. The modifiers might consider typical cases how often an addition did not show a change.

In R the machine precision can be estimated with:

getPrecR <- function(x) {
  y <- log2(pmax(.Machine$double.xmin, abs(x)))
  ifelse(x < 0 & floor(y) == y, 2^(y-1), 2^floor(y)) * .Machine$double.eps
}
getPrecR(1)
#[1] 2.220446e-16

Each double value is representing a range. For a simple addition, the range of the result depends on the reange of each summand and also the range of their sum.

library(Rcpp)
cppFunction("std::vector<double> getRange(double x) {return std::vector<double>{
   (nextafter(x, -std::numeric_limits<double>::infinity()) - x)/2.
 , (nextafter(x, std::numeric_limits<double>::infinity()) - x)/2.};}")

x <- 2^54 - 2
getRange(x)
#[1] -1  1
y <- 4.1
getRange(y)
#[1] -4.440892e-16  4.440892e-16
z <- x + y
getRange(z)
#[1] -2  2
z - x - y #Should be 0
#[1] 1.9

2^54 - 2.9 + 4.1 - (2^54 + 5.9) #Should be -4.7
#[1] 0
2^54 - 2.9 == 2^54 - 2      #Gain 0.9
2^54 - 2 + 4.1 == 2^54 + 4  #Gain 1.9
2^54 + 5.9 == 2^54 + 4      #Gain 1.9

For higher precission Rmpfr could be used.

library(Rmpfr)
mpfr("2", 1024L)^54 - 2.9 + 4.1 - (mpfr("2", 1024L)^54 + 5.9)
#[1] -4.700000000000000621724893790087662637233734130859375

In case it could be converted to integer gmp could be used (what is in Rmpfr).

library(gmp)
as.bigz("2")^54 * 10 - 29 + 41 - (as.bigz("2")^54 * 10 + 59)
#[1] -47

0人赞添加讨论(0) 举报

狗以群分

3楼-- · 2020-05-18 05:24

Definition of a machine.eps: it is the lowest value eps for which 1+eps is not 1

As a rule of thumb (assuming a floating point representation with base 2):
This eps makes the difference for the range 1 .. 2,
for the range 2 .. 4 the precision is 2*eps
and so on.

Unfortunately, there is no good rule of thumb here. It's entirely determined by the needs of your program.

In R we have all.equal as a built in way to test approximate equality. So you could use maybe something like (x<y) | all.equal(x,y)

i <- 0.1
 i <- i + 0.05
 i
if(isTRUE(all.equal(i, .15))) { #code was getting sloppy &went to multiple lines
    cat("i equals 0.15\n") 
} else {
    cat("i does not equal 0.15\n")
}
#i equals 0.15

Google mock has a number of floating point matchers for double precision comparisons, including DoubleEq and DoubleNear. You can use them in an array matcher like this:

ASSERT_THAT(vec, ElementsAre(DoubleEq(0.1), DoubleEq(0.2)));

Update:

Numerical Recipes provide a derivation to demonstrate that using a one-sided difference quotient, sqrt is a good choice of step-size for finite difference approximations of derivatives.

The Wikipedia article site Numerical Recipes, 3rd edition, Section 5.7, which is pages 229-230 (a limited number of page views is available at http://www.nrbook.com/empanel/).

all.equal(target, current,
           tolerance = .Machine$double.eps ^ 0.5, scale = NULL,
           ..., check.attributes = TRUE)

These IEEE floating point arithmetic is a well known limitation of computer arithmetic and is discussed in several places:

The FAQ in R has a whole question dedicated to it: R FAQ 7.31
- The R Inferno by Patrick Burns dedicated the first "Circle" to this problem (page 9 onward)
- Arithmetic Sum Proof Problem asked in Math Meta
- David Goldberg, "What Every Computer Scientist Should Know About Floating-point Arithmetic," ACM Computing Surveys 23, 1 (1991-03), 5-48 doi>10.1145/103162.103163 (revision also available)
- The Floating-Point Guide - What Every Programmer Should Know About Floating-Point Arithmetic
- 0.30000000000000004.com compares floating point arithmetic across programming languages
Canonical duplicate for "floating point is inaccurate" (a meta discussion about a canonical answer for this issue)
Several Stack Overflow questions including
Math Tricks Explained by Arthur T. Benjamin

. dplyr::near() is another option for testing if two vectors of floating point numbers are equal.

The function has a built in tolerance parameter: tol = .Machine$double.eps^0.5 that can be adjusted. The default parameter is the same as the default for all.equal().

0人赞添加讨论(0) 举报