Using round function in R on very small values ret

2019-07-21 04:35发布

问题:

I sometimes have to deal with very low p values and present them in a tabular format. The values returned by R can have long significance digits (i.e. digits after the decimal point). Now since the p value is anyways so low , I tend to shorten them before writing them into a .xls or .tsv files.(Just to make the tables look pretty !!)

I am using R version 3.0.0 (2013-04-03)

Some background and examples :

9.881313e-208 in R will be 9.88e-208 in my table

I can use the round function in R to do this.

round(9.881313e-208, 210)
[1] 9.88e-208

However the power value of e can differ in every case, and since there are many such cases I use the following formula :-

x = 9.881313e-208
round(x,abs(floor(log10(x)-2)))   ## I came to this following trial and error
[1] 9.88e-208

I have tested this formula empirically and it works in different cases like :-

a <- c(1.345678e-150,8.543678e-250,5.555555e-303, 0.01123, 4.523456e-290)
round(a,abs(floor(log10(a)-2)))
[1] 1.35e-150 8.54e-250 5.56e-303  1.12e-02 4.52e-290

Now the problem starts when the power of e exceeds the number 306 (even 307 is fine, but starts getting strange after 308)

## Example 1:
b <- c(1.345678e-306,1.345678e-307,1.345678e-308, 1.345678e-309, 1.345678e-310)
round(b,abs(floor(log10(b)-2)))
[1] 1.35e-306 1.30e-307 1.00e-308  0.00e+00  0.00e+00

## Example 2:
b <- c(3.345678e-306,3.345678e-307,3.345678e-308, 3.345678e-309, 3.345678e-310)
round(b,abs(floor(log10(b)-2)))

[1] 3.35e-306 3.30e-307 3.00e-308  0.00e+00  0.00e+00

## Example 3:
b <- c(7.345678e-306,7.345678e-307,7.345678e-308, 7.345678e-309, 7.345678e-310)
round(b,abs(floor(log10(b)-2)))

[1] 7.35e-306 7.30e-307 7.00e-308 1.00e-308  0.00e+00

Also, I checked these directly:

round(7.356789e-306,308)
[1] 7.36e-306

round(7.356789e-307,309)
[1] 7.4e-307

round(7.356789e-308,310)
[1] 7e-308

round(7.356789e-309,311)
[1] 1e-308

round(7.356789e-310,312)
[1] 0

round(7.356789e-311,313)
[1] 0

Am I missing something trivial here or does the round function hit a resolution limit beyond e-308. I know these values are extremely low and is almost equal to zero, however I would still prefer to have exact value. I saw some answers in SO using Python for this problem, (See How right way to round a very small negative number?) but are there any suggestions as to how can I overcome this in R ?

Help much appreciated

Cheers

Ashwin

回答1:

This answer is based on the assumption that R's floating point numbers are being represented by IEEE 754 64-bit binary numbers. That is consistent with the reported results, and inherently very likely.

Doing much arithmetic on numbers with absolute magnitude below about 2e-308 is very problematic. Below that point, precision drops with magnitude. The smallest representable number, about 4.9E-324, has one significant bit in its representation. Its pair of adjacent numbers are 0 and about 1.0E-323. Any rounding error will either reduce it to zero or double it. It cannot experience a subtle rounding error that only affects low significance digits in its decimal representation. Similarly, round cannot change it slightly. It can return it unchanged, double it, return 0, or make an even bigger change.

See Denormal number for more explanation of what is happening.

The solution is to, if at all possible, avoid doing arithmetic on such tiny numbers. One good approach, as already pointed out in comments, is to use logarithms. That is the only way to go if you need to deal with very large as well as very small numbers.

If that is not possible, and your numbers are all small, consider scaling by a moderately large power of two. Powers of two that are in range are exactly representable, and multiplying by them only changes the exponent, with no rounding error. You could scale all the numbers you are going to store in the text file by a constant before rounding.



回答2:

To overcome this, you can use the Rmpfr library.

Example:

library(Rmpfr)
x <- mpfr(7.356789e-311, 80)
round(x, 313)
1 'mpfr' number of precision  1040   bits 
[1]7.36000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000008e-311