I was looking to see if there is any standard type similar to `uint32_t`, which always maps to a 32-bit unsigned integral type, but I could not find any.

Is the size of `float` always 4 bytes on all platforms? Is the size of `double` always 8? Does either standard say anything on the matter?

I want to make sure that my sizes are always the same on all platforms (x86 and x64), so I am using the standard int types; but I could not find any similar typedef for `float` and `double`.
Excerpt from the C99 standard, normative Annex F (the C++ standard does not explicitly mention this annex, though it includes all affected functions unchanged by reference; the types also have to match for compatibility):
IEC 60559 floating-point arithmetic
F.1 Introduction
1 This annex specifies C language support for the IEC 60559 floating-point standard. The
IEC 60559 floating-point standard is specifically Binary floating-point arithmetic for
microprocessor systems, second edition (IEC 60559:1989), previously designated
IEC 559:1989 and as IEEE Standard for Binary Floating-Point Arithmetic
(ANSI/IEEE 754−1985). IEEE Standard for Radix-Independent Floating-Point
Arithmetic (ANSI/IEEE 854−1987) generalizes the binary standard to remove
dependencies on radix and word length. IEC 60559 generally refers to the floating-point
standard, as in IEC 60559 operation, IEC 60559 format, etc. An implementation that defines __STDC_IEC_559__ shall conform to the specifications in this annex.
Where a binding between the C language and IEC 60559 is indicated, the
IEC 60559-specified behavior is adopted by reference, unless stated otherwise. Since
negative and positive infinity are representable in IEC 60559 formats, all real numbers lie
within the range of representable values.
So, include `<math.h>` (or, in C++, perhaps `<cmath>`), and test for `__STDC_IEC_559__`.
If the macro is defined, not only are the types better specified (`float` being 32 bits and `double` being 64 bits, among other things), but the behavior of built-in operators and standard functions is also more tightly specified. Absence of the macro does not give any guarantees.
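For example, a minimal compile-time check (a sketch; whether a C++ implementation defines `__STDC_IEC_559__` at all is itself implementation-specific):

```cpp
#include <cmath>  // pulls in the C library's feature-test macros

#if defined(__STDC_IEC_559__)
// Annex F applies: float is the IEC 60559 single (32-bit) format and
// double is the IEC 60559 double (64-bit) format.
#else
#error "This implementation gives no IEC 60559 guarantee"
#endif
```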
For x86 and x86_64 (amd64), you can rely on the types `float` and `double` being IEC 60559-conformant, though functions using them and operations on them might not be.
The C++ standard does not say anything about the size. From §3.9.1.8:
There are three floating point types: float, double, and long double.
The type double provides at least as much precision as float, and the
type long double provides at least as much precision as double. The
set of values of the type float is a subset of the set of values of
the type double; the set of values of the type double is a subset of
the set of values of the type long double. The value representation of
floating-point types is implementation-defined. Integral and floating
types are collectively called arithmetic types. Specializations of the
standard template std::numeric_limits (18.3) shall specify the maximum
and minimum values of each arithmetic type for an implementation.
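Since the quoted paragraph points at `std::numeric_limits`, note that its `is_iec559` member reports whether a type uses the IEC 60559 (IEEE 754) format; a minimal sketch, assuming a C++11 compiler for `static_assert`:

```cpp
#include <limits>

// Fail the build unless float and double use the IEC 60559 (IEEE 754)
// binary formats on this implementation.
static_assert(std::numeric_limits<float>::is_iec559,
              "float is not IEC 60559 single precision");
static_assert(std::numeric_limits<double>::is_iec559,
              "double is not IEC 60559 double precision");
```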
The C++ standard doesn't say anything, but on most platforms C++ uses the IEEE single/double-precision formats, which define single precision as 4 bytes and double precision as 8 bytes.

http://en.wikipedia.org/wiki/Single-precision_floating-point_format
http://en.wikipedia.org/wiki/Double-precision_floating-point_format

I'm not sure which platforms, if any, are exceptions to this.
As floating point operations are implemented at a low level by CPUs, the C++ standard does not mandate a size for `float`, `double`, or `long double`. All it says is that the order in which I listed them is of equal or increasing precision.
Your best bet is to use `static_assert`, `sizeof`, `typedef`, and `#define` carefully in order to define cross-platform floating point types, as in the sketch below.
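A minimal sketch of that approach; the alias names `float32_t` and `float64_t` are hypothetical, chosen by analogy with `uint32_t`:

```cpp
#include <limits>

// Assumes 8-bit bytes; combine a size check with an IEEE-format check.
static_assert(sizeof(float) == 4 && std::numeric_limits<float>::is_iec559,
              "float is not a 32-bit IEEE 754 type on this platform");
static_assert(sizeof(double) == 8 && std::numeric_limits<double>::is_iec559,
              "double is not a 64-bit IEEE 754 type on this platform");

// Hypothetical fixed-size aliases, by analogy with uint32_t/uint64_t.
typedef float  float32_t;
typedef double float64_t;
```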
You can try to use a library that offers cross-platform data-type compatibility.
"The integral types C++ inherited from C are a cross-platform hazard. int, long and friends have different sizes on different platforms (32-bit and 64-bit on today's systems, maybe 128-bit later). For some applications it might seem irrelevant because they never approach the 32-bit limit (or rather 31-bit if you use unsigned integers), but if you serialize your objects on a 64-bit system and deserialize on a 32-bit system you might be unpleasantly surprised.
APR provides a set of typedefs for basic types that might be different on different platforms. These typedefs provide a guaranteed size and avoid the fuzzy built-in types. However, for some applications (mostly numerical) it is sometimes important to use the native machine word size (typically what int stands for) to achieve maximal performance."
Gigi Sayfan, "Building Your Own Plugin Framework" (from http://philippe.ameline.free.fr/techytechy/071125_PluginFramework.htm)
I want to point out that even if you have same-size floats, you cannot be sure these floats are interpreted identically on different platforms. You can read a lot of papers about 'floats over the network'. Floating point non-determinism is a known problem.
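One common mitigation is to transmit the raw bit pattern rather than the in-memory value; a sketch, assuming both ends use the IEC 60559 32-bit `float` (byte order on the wire still has to be agreed on separately):

```cpp
#include <cstdint>
#include <cstring>

// Copy the IEEE 754 bit pattern into a fixed-size integer for the wire.
// std::memcpy avoids the undefined behavior of pointer-cast type punning.
std::uint32_t float_to_bits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

float bits_to_float(std::uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```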
In the case of x86, even when using IEEE single- and double-precision numbers, internal calculations are affected by the floating point control word (FCW). Internal calculations are normally done at 64-bit or 80-bit precision (long double). You can override this using inline assembly code, but there's no guarantee that some double-precision library function won't set it back.
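For illustration, a sketch of adjusting the precision control field from MSVC using `_controlfp_s` (an MSVC-specific function declared in `<float.h>`; the `_MCW_PC` mask applies to 32-bit x86 only and is not supported on x64, where SSE2 arithmetic is used instead):

```cpp
#include <float.h>  // MSVC-specific: _controlfp_s, _PC_53, _MCW_PC

int main() {
    unsigned int previous = 0;
    // Force x87 intermediate results to 53-bit (double) precision so they
    // match 64-bit IEEE doubles instead of 80-bit extended precision.
    _controlfp_s(&previous, _PC_53, _MCW_PC);
    // ... floating point work here; note that a library call
    // may still reset the control word behind your back.
    return 0;
}
```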
Microsoft supported 80-bit long doubles with their 16-bit compilers, but dropped support for them in their 32-bit and 64-bit compilers; `long double` is now the same as `double`, at 64 bits.