I'm currently working on a C++ project that does numerical calculations. The vast majority of the code uses single precision floating point values and works perfectly fine with that. Because of this, I use a compiler flag (GCC's -fsingle-precision-constant) to make unsuffixed floating point literals single precision instead of double precision, which is the default. I find that this makes expressions easier to read, and I don't have to worry about forgetting an 'f' somewhere. However, every now and then I need the extra precision offered by double precision calculations, and my question is how I can get a double precision literal into such an expression. Every way I've tried so far first stores the value in single precision and then converts the truncated value to double precision, which is not what I want.
Some of the approaches I've tried so far are given below.
#include <iostream>

int main()
{
  // Unsuffixed literal: float under -fsingle-precision-constant.
  std::cout << sizeof(1.0E200) << std::endl;
  std::cout << 1.0E200 << std::endl;
  // L suffix: long double literal.
  std::cout << sizeof(1.0E200L) << std::endl;
  std::cout << 1.0E200L << std::endl;
  // Functional cast of an unsuffixed literal to double.
  std::cout << sizeof(double(1.0E200)) << std::endl;
  std::cout << double(1.0E200) << std::endl;
  // static_cast of an unsuffixed literal to double.
  std::cout << sizeof(static_cast<double>(1.0E200)) << std::endl;
  std::cout << static_cast<double>(1.0E200) << std::endl;
  return 0;
}
A run with single precision constants gives the following results.
~/path$ g++ test.cpp -fsingle-precision-constant && ./a.out
test.cpp:6:3: warning: floating constant exceeds range of ‘float’ [-Woverflow]
test.cpp:7:3: warning: floating constant exceeds range of ‘float’ [-Woverflow]
test.cpp:12:3: warning: floating constant exceeds range of ‘float’ [-Woverflow]
test.cpp:13:3: warning: floating constant exceeds range of ‘float’ [-Woverflow]
test.cpp:15:3: warning: floating constant exceeds range of ‘float’ [-Woverflow]
test.cpp:16:3: warning: floating constant exceeds range of ‘float’ [-Woverflow]
4
inf
16
1e+200
8
inf
8
inf
It is my understanding that the 8 bytes provided by the last two cases should be enough to hold 1.0E200, a theory supported by the following output, where the same program is compiled without -fsingle-precision-constant.
~/path$ g++ test.cpp && ./a.out
8
1e+200
16
1e+200
8
1e+200
8
1e+200
A possible workaround suggested by the above examples is to use long double literals (the L suffix) everywhere I originally intended to use double precision, and cast to double precision whenever required by libraries and such. However, this feels a bit wasteful.
What else can I do?