The following code is obviously wrong. What's the problem?
i <- 0.1
i <- i + 0.05
i
## [1] 0.15
if(i==0.15) cat("i equals 0.15") else cat("i does not equal 0.15")
## i does not equal 0.15
Adding to Brian's comment (which is the reason), you can overcome this by using all.equal instead.
Per Joshua's warning, here is the updated code (thanks, Joshua):
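A minimal sketch of that updated check, assuming the same i as in the question. The isTRUE() wrapper is used because all.equal() returns a character description rather than FALSE when the values differ, which is not safe inside if().

```r
i <- 0.1
i <- i + 0.05

# all.equal() compares within a small tolerance; isTRUE() turns its
# result into a plain TRUE/FALSE that is safe to use in if()
result <- if (isTRUE(all.equal(i, 0.15))) "i equals 0.15" else "i does not equal 0.15"
cat(result, "\n")
```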
dplyr::near() is an option for testing if two vectors of floating point numbers are equal. This is the example from the docs:
The function has a built-in tolerance parameter, tol = .Machine$double.eps^0.5, that can be adjusted. The default is the same as the default for all.equal().
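A sketch in the spirit of the docs example mentioned above, comparing sqrt(2)^2 with 2. The base R line assumes that near() amounts to abs(x - y) < tol, and the dplyr call is guarded so the snippet still runs where dplyr is not installed.

```r
x <- sqrt(2)^2

# Exact comparison fails because sqrt(2)^2 is not exactly 2
exact <- x == 2

# Base R tolerance check (assumed to mirror what near() does)
tol <- .Machine$double.eps^0.5
close_base <- abs(x - 2) < tol

# Only call dplyr::near() if dplyr is available
if (requireNamespace("dplyr", quietly = TRUE)) {
  close_dplyr <- dplyr::near(x, 2)
  print(close_dplyr)
}

print(c(exact = exact, close_base = close_base))
```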
Generalized comparisons ("<=", ">=", "=") in double precision arithmetic:
Comparing a <= b:
Comparing a >= b:
Comparing a = b:
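One possible way to write the three comparisons listed above with a tolerance. The helper names approx_eq, approx_leq and approx_geq are hypothetical, and the absolute tolerance is an assumption borrowed from the default of all.equal().

```r
# Hypothetical helpers; the default tolerance mirrors all.equal()
approx_eq <- function(a, b, tol = .Machine$double.eps^0.5) {
  abs(a - b) < tol
}
approx_leq <- function(a, b, tol = .Machine$double.eps^0.5) {
  (a < b) | approx_eq(a, b, tol)   # strictly smaller, or equal within tol
}
approx_geq <- function(a, b, tol = .Machine$double.eps^0.5) {
  (a > b) | approx_eq(a, b, tol)   # strictly larger, or equal within tol
}

a <- 0.1 + 0.05   # stored slightly above 0.15
b <- 0.15

plain_leq  <- a <= b             # exact comparison
tol_leq    <- approx_leq(a, b)   # tolerant comparison
tol_geq    <- approx_geq(b, a)
tol_eq     <- approx_eq(a, b)
clearly_gt <- approx_leq(0.2, b) # a real difference is still detected
```

Because the helpers only use vectorized operators, they also work element-wise on vectors.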
I had a similar problem. I used the following solution.
Output of unequal cut intervals based on options(digits = 2):
Output of equal cut intervals based on the round function:
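The original code and output for this answer are not shown, so the following is only a sketch of the idea under assumed values (a sequence from 0.1 to 0.9 cut at 0, 0.3, 0.6 and 0.9). It illustrates that options(digits = 2) changes printing only, while round() changes the stored values before binning.

```r
x <- seq(from = 0.1, to = 0.9, by = 0.1)
breaks <- c(0, 0.3, 0.6, 0.9)

# options(digits = ...) only affects how numbers are printed,
# so the bins are still computed from the unrounded values
old <- options(digits = 2)
raw_bins <- table(cut(x, breaks))
options(old)   # restore the previous print setting

# Rounding the values themselves before cutting changes the bins
rounded_bins <- table(cut(round(x, 1), breaks))

print(raw_bins)
print(rounded_bins)
```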
General (language agnostic) reason
Since not all numbers can be represented exactly in IEEE floating point arithmetic (the standard that almost all computers use to represent decimal numbers and do math with them), you will not always get what you expected. This is especially true because some values which are simple, finite decimals (such as 0.1 and 0.05) are not represented exactly in the computer and so the results of arithmetic on them may not give a result that is identical to a direct representation of the "known" answer.
This is a well-known limitation of computer arithmetic and is discussed in many places.
Comparing scalars
The standard solution to this in R is not to use ==, but rather the all.equal function. Or rather, since all.equal gives lots of detail about the differences if there are any, isTRUE(all.equal(...)).
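Applied to the example from the question, a sketch could look like this. The extra comparison against 0.2 is an added illustration of why the isTRUE() wrapper matters.

```r
i <- 0.1
i <- i + 0.05

msg <- if (isTRUE(all.equal(i, 0.15))) "i equals 0.15" else "i does not equal 0.15"
cat(msg, "\n")

# When values differ, all.equal() returns a description, not FALSE
detail <- all.equal(i, 0.2)
differs <- !isTRUE(detail)
print(detail)
```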
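Some more examples of using all.equal instead of ==: both 1 - 0.1 - 0.1 - 0.1 == 0.7 and 0.3 / 0.1 == 3 are FALSE, but the corresponding isTRUE(all.equal(...)) calls are TRUE. A last example, 0.1 + 0.1 against 0.15, is FALSE either way, which shows that genuine differences are still reported.
Some more detail, directly copied from an answer to a similar question: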
The problem you have encountered is that floating point cannot represent decimal fractions exactly in most cases, which means you will frequently find that exact matches fail.
Meanwhile, R lies slightly when you say:
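The values used in the original answer are not shown here, so this sketch assumes 1.1 - 0.2 and 0.9 as the pair. With the default print precision the two look identical even though they are not.

```r
x <- 1.1 - 0.2
y <- 0.9

print(x)
print(y)

# Same text at the default number of significant digits ...
same_printed <- identical(format(x), format(y))
# ... but not the same stored value
exactly_equal <- x == y
```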
You can find out what it really thinks in decimal:
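A sketch using sprintf() with many decimal places, again assuming 1.1 - 0.2 and 0.9 as the pair. The choice of 54 digits is an assumption, enough to show the stored value in full.

```r
# Print far more digits than the default to expose the stored values
dec_x <- sprintf("%.54f", 1.1 - 0.2)
dec_y <- sprintf("%.54f", 0.9)

cat(dec_x, "\n")
cat(dec_y, "\n")
```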
You can see these numbers are different, but the representation is a bit unwieldy. If we look at them in binary (well, hex, which is equivalent) we get a clearer picture:
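The hex view can be sketched with the "%a" format of sprintf(), assuming the same pair of values. The exact text of the output may vary by platform, so the gap is also computed directly.

```r
# "%a" prints the binary representation in hexadecimal notation
hex_x <- sprintf("%a", 1.1 - 0.2)
hex_y <- sprintf("%a", 0.9)

cat(hex_x, "\n")
cat(hex_y, "\n")

# The difference between the two stored values
gap <- (1.1 - 0.2) - 0.9
print(gap)
```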
You can see that they differ by 2^-53, which is important because this number is the smallest representable difference between two numbers whose value is close to 1, as this is.
We can find out for any given computer what this smallest representable number is by looking in R's .Machine variable:
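A short sketch of that lookup. The comparisons assume IEEE 754 double precision, which is what R uses on essentially all current platforms.

```r
eps <- .Machine$double.eps
print(eps)

# eps is the gap between 1 and the next larger double
is_two_pow <- eps == 2^-52
distinguishable <- (1 + eps) > 1
# Half of eps is lost when added to 1
absorbed <- (1 + eps / 2) == 1
```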
You can use this fact to create a 'nearly equals' function which checks that the difference is close to the smallest representable number in floating point. In fact this already exists: all.equal.
So the all.equal function is actually checking that the difference between the numbers is smaller than a tolerance, which defaults to the square root of the smallest difference between two mantissas (.Machine$double.eps^0.5).
This algorithm goes a bit funny near extremely small numbers called denormals, but you don't need to worry about that.
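To make the tolerance concrete, here is a sketch with a few assumed test values. The tolerance argument of all.equal() can be passed explicitly to loosen or tighten the check.

```r
default_tol <- .Machine$double.eps^0.5   # roughly 1.5e-8
print(default_tol)

tiny_gap  <- isTRUE(all.equal(1.1 - 0.2, 0.9))   # rounding-level gap
small_gap <- isTRUE(all.equal(1, 1 + 1e-10))     # below the default tolerance
large_gap <- isTRUE(all.equal(1, 1 + 1e-6))      # above the default tolerance

# A looser tolerance accepts the larger gap
loosened <- isTRUE(all.equal(1, 1 + 1e-6, tolerance = 1e-4))
```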
Comparing vectors
The above discussion assumed a comparison of two single values. In R, there are no scalars, just vectors, and implicit vectorization is a strength of the language. For comparing the values of vectors element-wise, the previous principles hold, but the implementation is slightly different.
== is vectorized (does an element-wise comparison) while all.equal compares the whole vectors as a single entity.
Using the previous examples, == does not give the "expected" result and all.equal does not perform the comparison element-wise.
Rather, a version which loops over the two vectors must be used.
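A sketch of the three behaviours described above. The vectors a and b are assumed example values, and mapply() is one way to loop over the two vectors pairwise.

```r
a <- c(0.1 + 0.05, 1 - 0.1 - 0.1 - 0.1, 0.3 / 0.1, 0.1 + 0.1)
b <- c(0.15, 0.7, 3, 0.15)

# Element-wise, but exact: rounding error makes every pair differ
exact <- a == b

# Tolerant, but one answer for the whole vector
whole <- isTRUE(all.equal(a, b))

# Tolerant and element-wise: apply all.equal() to each pair
elementwise <- mapply(function(x, y) isTRUE(all.equal(x, y)), a, b)

print(exact)
print(whole)
print(elementwise)
```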
If a functional version of this is desired, it can be written with Vectorize and then called directly on the two vectors.
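One way such a functional version might look. The name elementwise_all_equal is hypothetical, and a and b are the same assumed example vectors.

```r
# Vectorize() wraps the pairwise check so it accepts whole vectors
elementwise_all_equal <- Vectorize(function(x, y) isTRUE(all.equal(x, y)))

a <- c(0.1 + 0.05, 1 - 0.1 - 0.1 - 0.1, 0.3 / 0.1, 0.1 + 0.1)
b <- c(0.15, 0.7, 3, 0.15)

res <- elementwise_all_equal(a, b)
print(res)
```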
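Alternatively, instead of wrapping all.equal in even more function calls, you can just replicate the relevant internals of all.equal.numeric and use implicit vectorization, that is, test abs(a - b) < tolerance with tolerance = .Machine$double.eps^0.5.
This is the approach taken by dplyr::near, which documents itself as a safe way of comparing whether two vectors of floating point numbers are (pairwise) equal.
This is hackish, but quick: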
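The code for this quick approach is not shown, so the following is only a guess at its shape: round the computed value to a fixed number of decimal places before comparing with ==. The choice of 10 digits is an assumption.

```r
i <- 0.1
i <- i + 0.05

# Rounding removes the tiny representation error before the exact comparison
rounded_equal <- round(i, 10) == 0.15

msg <- if (rounded_equal) "i equals 0.15" else "i does not equal 0.15"
cat(msg, "\n")
```

This only works when the number of digits is chosen to suit the magnitude of the values being compared.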