binary representation of a float

Posted 2019-05-30 20:13

Question:

Why do these two calls to the toBinary function produce the same output (at least under VS2010)?

#include <iostream>
#include <bitset>
#include <limits>
#include <climits>   // CHAR_BIT
using namespace std;

template<class T> bitset<sizeof(T)*CHAR_BIT> toBinary(const T num) 
{
    bitset<sizeof(T)*CHAR_BIT> mybits;
    const char * const p = reinterpret_cast<const char*>(&num);
    for (int i = sizeof(T)*CHAR_BIT-1 ; i >= 0 ; --i)
        mybits.set(i, (*(p)&(1<<i) ));
    return mybits;
}

int main() 
{
    cout << toBinary(8.9).to_string() << "\n"; 
    cout << toBinary( 8.9 + std::numeric_limits<double>::epsilon() ).to_string()  << "\n"; 
    cin.get();
}

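Answer 1:

That epsilon is relative to 1; here you are instead adding it to 8.9, which is more than 8 (2^3) times larger than 1. This means that the epsilon would change a binary digit three places to the right of the rightmost digit stored in that double, so the sum rounds back to 8.9.

If you want to see something change, you have to add about 8.9*epsilon. The spacing between adjacent doubles in the range [8, 16) is 8*epsilon, so the increment has to exceed half of that (4*epsilon) before the sum rounds up to the next representable value.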
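To make the scaling argument concrete, here is a minimal sketch that computes the spacing between doubles near a given value and checks whether an addition survives rounding. The helper names spacing_near and addition_changes_value are made up for illustration, and the sketch assumes IEEE 754 doubles and a positive, normal (not denormal) input.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: spacing between x and the next larger double.
// Assumes x is positive and normal. frexp gives x = m * 2^e with m in [0.5, 1),
// so the spacing is epsilon * 2^(e - 1).
double spacing_near(double x)
{
    int e = 0;
    std::frexp(x, &e);
    return std::ldexp(std::numeric_limits<double>::epsilon(), e - 1);
}

// Hypothetical helper: true if x + delta rounds to a double different from x.
bool addition_changes_value(double x, double delta)
{
    // volatile forces the sum to be stored as a double, which guards against
    // extended-precision intermediates on some older targets.
    volatile double sum = x + delta;
    return sum != x;
}
```

With these, spacing_near(1.0) equals epsilon and spacing_near(8.9) equals 8*epsilon; addition_changes_value(8.9, epsilon) is false, while addition_changes_value(8.9, 8.9*epsilon) is true.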



Answer 2:

You have two problems. The first is that your toBinary function doesn't do what you want: it only ever reads the first byte of num, and it shifts 1 by more bits than an int holds. It should read like so (assuming a little-endian CPU):

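template<class T> bitset<sizeof(T)*CHAR_BIT> toBinary(const T num)
{
    bitset<sizeof(T)*CHAR_BIT> mybits;
    const char * const p = reinterpret_cast<const char*>(&num);
    // Walk every byte of num; on a little-endian CPU byte i holds bits i*CHAR_BIT and up.
    for (int i = sizeof(T)-1; i >= 0; i--)
        // Test each bit of the current byte, so the shift never exceeds CHAR_BIT-1.
        for (int j = CHAR_BIT-1; j >= 0; j--)
            mybits.set(i*CHAR_BIT + j, (p[i] & (1 << j)) != 0);
    return mybits;
}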

The other problem is as Matteo describes: numeric_limits<double>::epsilon() is the difference between 1.0 and the next larger representable value, not the difference between any floating point number and the next larger representable value. You can see this for yourself by modifying your program to try incrementing 0.5, 1.0, and 2.0: adding epsilon will increment the second-to-last bit of 0.5, increment the last bit of 1.0, and have no effect on 2.0.
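The 0.5 / 1.0 / 2.0 experiment can also be sketched without depending on byte order, by copying the double into an integer and comparing the raw bit patterns. The names bits_of and steps_moved_by_epsilon are illustrative only, and the sketch assumes a 64-bit IEEE 754 double and a positive finite input (for such values, adjacent doubles have adjacent integer bit patterns).

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "this sketch assumes a 64-bit double");

// Hypothetical helper: the raw bit pattern of x as an unsigned integer.
// memcpy avoids both aliasing problems and endianness assumptions.
std::uint64_t bits_of(double x)
{
    std::uint64_t u;
    std::memcpy(&u, &x, sizeof u);
    return u;
}

// Hypothetical helper: how many representable doubles x moves forward
// when epsilon is added to it (x must be positive and finite).
std::uint64_t steps_moved_by_epsilon(double x)
{
    // volatile forces the sum to be rounded to a double before it is inspected.
    volatile double sum = x + std::numeric_limits<double>::epsilon();
    return bits_of(sum) - bits_of(x);
}
```

Under those assumptions, steps_moved_by_epsilon returns 2 for 0.5, 1 for 1.0, and 0 for 2.0, which matches the description above.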

There is a way to do what you're trying to do, though: the nextafter family of functions (they're part of C99).
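As a rough sketch of how that might look, the following wraps std::nextafter from <cmath>, which assumes a C++11 or later standard library (older compilers may only expose the C function or a vendor-specific equivalent). The wrapper names next_up and next_down are made up for this example.

```cpp
#include <cmath>
#include <limits>

// Hypothetical wrapper: the smallest double strictly greater than x.
double next_up(double x)
{
    return std::nextafter(x, std::numeric_limits<double>::infinity());
}

// Hypothetical wrapper: the largest double strictly less than x.
double next_down(double x)
{
    return std::nextafter(x, -std::numeric_limits<double>::infinity());
}
```

Passing next_up(8.9) to toBinary instead of 8.9 + epsilon should then produce a bit pattern that differs from that of 8.9 in the lowest bits. For 1.0 the two approaches agree, since next_up(1.0) equals 1.0 + epsilon.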



Tags: c++ double