Range of integers that can be expressed precisely

Posted 2019-04-29 09:40

What is the exact range of (contiguous) integers that can be expressed as a double (resp. float)? I ask because I am curious, for questions such as this one, when a loss of accuracy will occur.

That is

  1. What is the least positive integer m such that m+1 cannot be precisely expressed as a double (resp. float)?
  2. What is the greatest negative integer -n such that -n-1 cannot be precisely expressed as a double (resp. float)? (This may be the same as the above.)

This means that every integer between -n and m has an exact floating-point representation. I'm basically looking for the range [-n, m] for both floats and doubles.

Let's limit the scope to the standard IEEE 754 32-bit and 64-bit floating point representations. I know that the float has 24 bits of precision and the double has 53 bits (both with a hidden leading bit), but due to the intricacies of the floating point representation I'm looking for an authoritative answer for this. Please don't wave your hands!

(Ideal answer would prove that all the integers from 0 to m are expressible, and that m+1 is not.)

1 answer
Lonely孤独者°
#2 · 2019-04-29 10:17

Since you're asking about IEEE floating-point types, the language does not matter.

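#include <iostream>
using namespace std;

int main(){

    // float: 24-bit significand
    float f0 = 16777215.; // 2^24 - 1
    float f1 = 16777216.; // 2^24
    float f2 = 16777217.; // 2^24 + 1, not representable: stored as 2^24

    cout << (f0 == f1) << endl; // 0: both are exact and distinct
    cout << (f1 == f2) << endl; // 1: f2 collapsed onto f1

    // double: 53-bit significand
    double d0 = 9007199254740991.; // 2^53 - 1
    double d1 = 9007199254740992.; // 2^53
    double d2 = 9007199254740993.; // 2^53 + 1, not representable: stored as 2^53

    cout << (d0 == d1) << endl; // 0: both are exact and distinct
    cout << (d1 == d2) << endl; // 1: d2 collapsed onto d1
}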

Output:

0
1
0
1

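So the limit for float is 2^24 and the limit for double is 2^53: with m equal to that limit, every integer from 0 to m is exactly representable and m+1 is not. The reason is the significand width p (24 for float, 53 for double). Any integer below 2^p fits in p bits and is therefore exact, 2^p itself is exact because it is a power of two, but 2^p + 1 needs p+1 significant bits. Negatives are the same since the only difference is the sign bit, so the contiguous range is [-2^24, 2^24] for float and [-2^53, 2^53] for double.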
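
The same limits can be computed from the significand width instead of being hard-coded. What follows is only a minimal sketch, assuming the platform's float and double are the IEEE 754 32-bit and 64-bit formats; the helper names (contiguous_integer_limit, is_exact, all_float_integers_exact, demo) are invented for this illustration. It also brute-forces the float case, which is about as close to a proof as a program gets. The double case is too large to enumerate, so only the boundary is checked there.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Largest m such that every integer in [-m, m] is exactly representable in T.
// For a binary format with p significand bits (hidden bit included) this is 2^p.
template <typename T>
std::int64_t contiguous_integer_limit() {
    static_assert(std::numeric_limits<T>::radix == 2, "binary formats only");
    static_assert(std::numeric_limits<T>::digits < 63, "limit must fit in int64_t");
    return std::int64_t{1} << std::numeric_limits<T>::digits;
}

// True if n survives a round trip through T unchanged.
// Intended for |n| small enough that the converted value fits back in int64_t.
template <typename T>
bool is_exact(std::int64_t n) {
    return static_cast<std::int64_t>(static_cast<T>(n)) == n;
}

// Brute-force check: every integer in [-2^24, 2^24] is exact as a float.
bool all_float_integers_exact() {
    const std::int64_t m = contiguous_integer_limit<float>();
    for (std::int64_t n = -m; n <= m; ++n) {
        if (!is_exact<float>(n)) return false;
    }
    return true;
}

// Example usage: the boundary behaves as described in the answer.
void demo() {
    const std::int64_t mf = contiguous_integer_limit<float>();   // 2^24
    const std::int64_t md = contiguous_integer_limit<double>();  // 2^53
    assert(is_exact<float>(mf) && !is_exact<float>(mf + 1));
    assert(is_exact<double>(md) && !is_exact<double>(md + 1));
    assert(is_exact<float>(-mf) && !is_exact<float>(-mf - 1));
    assert(all_float_integers_exact());
}
```

Calling demo() from main should pass silently on such a platform. The template is deliberately restricted to types with fewer than 63 significand bits, so it will refuse to compile for an 80-bit long double rather than overflow the shift.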
