What is the size of float and double in C and C++?

Posted 2019-01-19 10:22

Question:

This question already has an answer here:

  • Fixed-size floating point types 4 answers

I was looking for a standard floating-point type analogous to uint32_t (which always maps to a 32-bit unsigned integral type), but I could not find one.

Is the size of float always 4 bytes on all platforms?
Is the size of double always 8 bytes?

Does either standard say anything on the matter?

I want to make sure that the sizes of my types are the same on all platforms (x86 and x64), so I am using the standard fixed-width integer types, but I could not find any similar typedef for float and double.

Answer 1:

Excerpt from the C99 standard, normative Annex F (the C++ standard does not explicitly mention this annex, though it incorporates all affected functions unchanged by reference; the types also have to match for compatibility):

IEC 60559 floating-point arithmetic

F.1 Introduction

1 This annex specifies C language support for the IEC 60559 floating-point standard. The IEC 60559 floating-point standard is specifically Binary floating-point arithmetic for microprocessor systems, second edition (IEC 60559:1989), previously designated IEC 559:1989 and as IEEE Standard for Binary Floating-Point Arithmetic (ANSI/IEEE 754−1985). IEEE Standard for Radix-Independent Floating-Point Arithmetic (ANSI/IEEE 854−1987) generalizes the binary standard to remove dependencies on radix and word length. IEC 60559 generally refers to the floating-point standard, as in IEC 60559 operation, IEC 60559 format, etc. An implementation that defines __STDC_IEC_559__ shall conform to the specifications in this annex. Where a binding between the C language and IEC 60559 is indicated, the IEC 60559-specified behavior is adopted by reference, unless stated otherwise. Since negative and positive infinity are representable in IEC 60559 formats, all real numbers lie within the range of representable values.

So, test for __STDC_IEC_559__. It is specified as a predefined macro, so no header such as <math.h> (or <cmath> in C++) should be needed to see it.

If the macro is defined, not only are the types better specified (float being the 32-bit single format and double the 64-bit double format, among others), but the behavior of built-in operators and standard functions is also specified more tightly.
If the macro is not defined, you get none of these guarantees.

For x86 and x86_64 (amd64), you can rely on the types float and double being IEC-60559-conformant, though functions using them and operations on them might not be.
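
A minimal sketch of that check in C++, assuming a C++11 compiler. Whether __STDC_IEC_559__ is defined in C++ mode is up to the implementation, so the sketch also queries std::numeric_limits<T>::is_iec559 from <limits>. The names float_is_iec559, double_is_iec559 and has_stdc_iec_559 are made up for illustration.

```cpp
#include <limits>

// True when the implementation reports an IEC 60559 (IEEE 754) format for the type.
constexpr bool float_is_iec559  = std::numeric_limits<float>::is_iec559;
constexpr bool double_is_iec559 = std::numeric_limits<double>::is_iec559;

// Hypothetical helper: reports whether the Annex F macro is predefined
// for this translation unit.
constexpr bool has_stdc_iec_559() {
#ifdef __STDC_IEC_559__
    return true;
#else
    return false;
#endif
}
```

Note that is_iec559 only describes the type's format and basic semantics; it does not promise that every library function is correctly rounded.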



Answer 2:

The C++ standard does not say anything about the size.

§3.9.1/8:

There are three floating point types: float, double, and long double. The type double provides at least as much precision as float, and the type long double provides at least as much precision as double. The set of values of the type float is a subset of the set of values of the type double; the set of values of the type double is a subset of the set of values of the type long double. The value representation of floating-point types is implementation-defined. Integral and floating types are collectively called arithmetic types. Specializations of the standard template std::numeric_limits (18.3) shall specify the maximum and minimum values of each arithmetic type for an implementation.
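
Since the representation is implementation-defined, the practical way to see what a given compiler chose is to ask std::numeric_limits. A small sketch, assuming C++11; describe is a hypothetical helper name, not a standard function.

```cpp
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: summarizes what this implementation chose for T.
template <typename T>
std::string describe(const char* name) {
    typedef std::numeric_limits<T> lim;
    std::ostringstream out;
    out << name << ": " << sizeof(T) << " bytes, "
        << lim::digits << " significand bits, "
        << "max " << lim::max()          // largest finite value
        << ", lowest " << lim::lowest(); // most negative finite value (C++11)
    return out.str();
}
```

For example, std::cout << describe<float>("float") << '\n'; prints one line for float. Keep in mind that for floating-point types numeric_limits<T>::min() is the smallest positive normalized value, which is why the sketch uses lowest() instead.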



Answer 3:

The C++ standard doesn't say anything, but on most platforms C++ implementations use the IEEE 754 single and double precision formats, which define single precision as 4 bytes and double precision as 8 bytes.

http://en.wikipedia.org/wiki/Single-precision_floating-point_format
http://en.wikipedia.org/wiki/Double-precision_floating-point_format

I'm not sure about the exceptions for these cases.



Answer 4:

Because floating-point operations are implemented at a low level by CPUs, the C++ standard does not mandate a size for float, double, or long double. All it says is that the three types, in that order, have equal or increasing precision.

Your best bet is to use static_assert, sizeof, typedef and #define carefully in order to define cross platform floating point types.
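
One way this could look, as a sketch assuming C++11. The namespace portable and the alias names float32 and float64 are invented for this example. The build fails on any platform where the assumptions do not hold, instead of silently producing different sizes.

```cpp
#include <climits>
#include <limits>

namespace portable {

// Hypothetical alias names; they are not part of the standard library.
typedef float  float32;
typedef double float64;

// Refuse to compile if the aliases do not have the expected width.
static_assert(sizeof(float32) * CHAR_BIT == 32, "float32 must be 32 bits wide");
static_assert(sizeof(float64) * CHAR_BIT == 64, "float64 must be 64 bits wide");

// Refuse to compile if the formats are not IEC 60559 (IEEE 754).
static_assert(std::numeric_limits<float32>::is_iec559, "float32 must be IEC 60559");
static_assert(std::numeric_limits<float64>::is_iec559, "float64 must be IEC 60559");

}  // namespace portable
```

On a platform with different native types, the typedefs could be switched with #if/#define on compiler-specific macros, which is where the "carefully" comes in.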



Answer 5:

You can try using a library that provides cross-platform data type compatibility.

"The integral types C++ inherited from C are a cross-platform hazard. int, long and friends have different sizes on different platforms (32-bit and 64-bit on today's systems, maybe 128-bit later). For some applications it might seem irrelevant because they never approach the 32-bit limit (or rather 31-bit if you use unsigned integers), but if you serialize your objects on a 64-bit system and deserialize on a 32-bit system you might be unpleasantly surprised.
APR provides a set of typedefs for basic types that might be different on different platforms. These typedefs provide a guaranteed size and avoid the fuzzy built-in types. However, for some applications (mostly numerical) it is sometimes important to use the native machine word size (typically what int stands for) to achieve maximal performance."

Gigi SAYFAN - Building Your Own Plugin Framework (From http://philippe.ameline.free.fr/techytechy/071125_PluginFramework.htm)



Answer 6:

I want to point out that even if your floats have the same size, you cannot be sure they are interpreted identically on different platforms. You can read a lot of papers about 'floats over network'. Floating-point non-determinism is a known problem.
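
For the representation part of the problem, a common approach is to send the bit pattern in a fixed byte order rather than the raw memory. The sketch below assumes float is IEEE 754 binary32 and that float and std::uint32_t share the same endianness (the usual case); encode_float_be and decode_float_be are hypothetical names. It does nothing about differences in how results are computed on each side.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559 && sizeof(float) == 4,
              "this sketch assumes a 32-bit IEEE 754 float");

// Writes the bit pattern of value into out[0..3], most significant byte first.
inline void encode_float_be(float value, unsigned char out[4]) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);  // copy the bits without aliasing issues
    out[0] = static_cast<unsigned char>((bits >> 24) & 0xFFu);
    out[1] = static_cast<unsigned char>((bits >> 16) & 0xFFu);
    out[2] = static_cast<unsigned char>((bits >> 8) & 0xFFu);
    out[3] = static_cast<unsigned char>(bits & 0xFFu);
}

// Rebuilds the float from four bytes written by encode_float_be.
inline float decode_float_be(const unsigned char in[4]) {
    const std::uint32_t bits = (static_cast<std::uint32_t>(in[0]) << 24) |
                               (static_cast<std::uint32_t>(in[1]) << 16) |
                               (static_cast<std::uint32_t>(in[2]) << 8) |
                               static_cast<std::uint32_t>(in[3]);
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```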



Answer 7:

On x86, even when using IEEE single and double precision numbers, internal calculations on the x87 FPU are affected by the floating-point control word (FCW). The internal calculations are normally carried out in 64-bit or 80-bit (long double) precision. You can override this using inline assembly code, but there's no guarantee that some double precision library function won't set it back.
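
A portable way to see what the compiler promises about intermediate precision is the FLT_EVAL_METHOD macro from <cfloat> (C99 and C++11). This is only a sketch; eval_method_description is a made-up helper name, and the macro reflects the compiler's evaluation model, not any changes made to the control word at run time.

```cpp
#include <cfloat>

// Hypothetical helper: describes how intermediate results are evaluated,
// according to FLT_EVAL_METHOD.
inline const char* eval_method_description() {
    switch (FLT_EVAL_METHOD) {
        case 0:  return "each operation is evaluated in the precision of its own type";
        case 1:  return "float and double operations are evaluated as double";
        case 2:  return "all operations are evaluated as long double";
        default: return "indeterminable or implementation-defined";
    }
}
```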

Microsoft supported 80-bit long doubles with their 16-bit compilers, but dropped support for them with their 32-bit and 64-bit compilers; long double is now the same as double, at 64 bits.