Size guarantee for integral/arithmetic types in C and C++

Posted 2020-02-08 18:19

Question:

I know that the C++ standard explicitly guarantees the size of only char, signed char and unsigned char. Also it gives guarantees that, say, short is at least as big as char, int as big as short etc. But no explicit guarantees about absolute value of, say, sizeof(int). This was the info in my head and I lived happily with it. Some time ago, however, I came across a comment in SO (can't find it) that in C long is guaranteed to be at least 4 bytes, and that requirement is "inherited" by C++. Is that the case? If so, what other implicit guarantees do we have for the sizes of arithmetic types in C++? Please note that I am absolutely not interested in practical guarantees across different platforms in this question, just theoretical ones.

Answer 1:

Section 18.2.2 of the C++03 standard guarantees that <climits> has the same contents as the C library header <limits.h>.

The ISO C90 standard is tricky to get hold of, which is a shame considering that C++ relies on it, but the section "Numerical limits" (numbered 2.2.4.2 in a random draft I tracked down on one occasion and have lying around) gives minimum values for the INT_MAX etc. constants in <limits.h>. For example ULONG_MAX must be at least 4294967295, from which we deduce that the width of long is at least 32 bits.
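The deduction from a minimum limit to a minimum width can be spelled out in code. This is only a sketch: bits_needed is a hypothetical helper name, not anything from the standard, and it simply counts how many bits are required to represent every value from 0 up to a given maximum.

```cpp
#include <cassert>
#include <climits>

// Number of value bits needed to represent every integer in [0, max_value].
// Written as a single-return recursive function so it is valid C++11 constexpr.
constexpr unsigned bits_needed(unsigned long long max_value) {
    return max_value == 0 ? 0u : 1u + bits_needed(max_value >> 1);
}

// Minimum magnitudes from <limits.h> and the widths they imply.
static_assert(bits_needed(255ULL) == 8, "UCHAR_MAX >= 255 implies 8 bits");
static_assert(bits_needed(65535ULL) == 16, "UINT_MAX >= 65535 implies 16 bits");
static_assert(bits_needed(4294967295ULL) == 32, "ULONG_MAX >= 4294967295 implies 32 bits");

// The actual limit on this implementation can only be equal or larger.
static_assert(bits_needed(ULONG_MAX) >= 32, "unsigned long has at least 32 value bits");
```

If the file compiles, the arithmetic holds: a type that must reach 4294967295 cannot have fewer than 32 value bits.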

There are similar restrictions in the C99 standard, but of course those aren't the ones referenced by C++03.

This does not guarantee that long is at least 4 bytes, since in C and C++ "byte" is basically defined to mean "char", and it is not guaranteed that CHAR_BIT is 8 in C or C++. CHAR_BIT == 8 is guaranteed by both POSIX and Windows.



Answer 2:

Don't know about C++. In C you have


                                  Annex E
                              (informative)


                          Implementation limits

       [#1]  The contents of the header <limits.h> are given below,
       in alphabetical order.  The minimum magnitudes  shown  shall
       be  replaced  by  implementation-defined magnitudes with the
       same sign.  The values shall  all  be  constant  expressions
       suitable  for  use  in  #if  preprocessing  directives.  The
       components are described further in 5.2.4.2.1.

               #define CHAR_BIT                         8
               #define CHAR_MAX    UCHAR_MAX or SCHAR_MAX
               #define CHAR_MIN            0 or SCHAR_MIN
               #define INT_MAX                     +32767
               #define INT_MIN                     -32767
               #define LONG_MAX               +2147483647
               #define LONG_MIN               -2147483647
               #define LLONG_MAX     +9223372036854775807
               #define LLONG_MIN     -9223372036854775807
               #define MB_LEN_MAX                       1
               #define SCHAR_MAX                     +127
               #define SCHAR_MIN                     -127
               #define SHRT_MAX                    +32767
               #define SHRT_MIN                    -32767
               #define UCHAR_MAX                      255
               #define USHRT_MAX                    65535
               #define UINT_MAX                     65535
               #define ULONG_MAX               4294967295
               #define ULLONG_MAX    18446744073709551615

So, in terms of range, char <= short <= int <= long <= long long

and

CHAR_BIT * sizeof (char) >= 8
CHAR_BIT * sizeof (short) >= 16
CHAR_BIT * sizeof (int) >= 16
CHAR_BIT * sizeof (long) >= 32
CHAR_BIT * sizeof (long long) >= 64
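These inequalities can be checked at compile time on whatever implementation you happen to use. A minimal sketch, assuming a C++11 compiler (for static_assert and long long); storage_bits is a made-up helper name used only for illustration.

```cpp
#include <cassert>
#include <climits>

// Storage width of T in bits: sizeof counts chars, CHAR_BIT is the number of bits per char.
template <typename T>
constexpr unsigned storage_bits() {
    return static_cast<unsigned>(sizeof(T) * CHAR_BIT);
}

// A conforming implementation must pass all of these.
static_assert(CHAR_BIT >= 8, "char has at least 8 bits");
static_assert(storage_bits<char>() >= 8, "char: at least 8 bits");
static_assert(storage_bits<short>() >= 16, "short: at least 16 bits");
static_assert(storage_bits<int>() >= 16, "int: at least 16 bits");
static_assert(storage_bits<long>() >= 32, "long: at least 32 bits");
static_assert(storage_bits<long long>() >= 64, "long long: at least 64 bits");
```

Note that these are lower bounds on bits, not on sizeof: nothing here says sizeof(long) >= 4.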



Answer 3:

Yes, C++ type sizes are inherited from C89.

I can't find the specification right now. But it's in the Bible.



Answer 4:

Be aware that the guaranteed ranges of these types are one value narrower than what most machines provide:

signed char: -127 ... +127 is guaranteed, but most two's complement machines have -128 ... +127

Likewise for the larger types.
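To see the difference between the guaranteed range and the actual one, you can compare the <climits> constants against the symmetric minimums. This is a sketch only; has_extra_negative is a hypothetical helper, and the code deliberately makes no claim about which answer your platform gives.

```cpp
#include <cassert>
#include <climits>

// True when a type's range includes a negative value with no positive counterpart
// (for example -128 next to +127), i.e. it goes beyond the symmetric range.
constexpr bool has_extra_negative(long long min_value, long long max_value) {
    return min_value < -max_value;
}

// What the <limits.h> minimum magnitudes actually promise: symmetric ranges.
static_assert(SCHAR_MIN <= -127 && SCHAR_MAX >= 127, "signed char covers -127..+127");
static_assert(SHRT_MIN <= -32767 && SHRT_MAX >= 32767, "short covers -32767..+32767");
static_assert(INT_MIN <= -32767 && INT_MAX >= 32767, "int covers -32767..+32767");
static_assert(LONG_MIN <= -2147483647L && LONG_MAX >= 2147483647L,
              "long covers -2147483647..+2147483647");
```

Calling has_extra_negative(SCHAR_MIN, SCHAR_MAX) tells you whether your implementation offers the extra value; portable code should not rely on it being there.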



Answer 5:

There are several inaccuracies in what you read. These inaccuracies were either present in the source, or maybe you remembered it all incorrectly.

Firstly, a pedantic remark about one peculiar difference between C and C++. The C language does not make any guarantees about the relative sizes of integer types (in bytes). It only makes guarantees about their relative ranges. It is true that the range of int is always at least as large as the range of short and so on. However, the C standard formally allows sizeof(short) > sizeof(int). In such a case the extra bits in short would serve as padding bits, not used for value representation. Obviously, this is something that is merely allowed by the legal language in the standard, not something anyone is likely to encounter in practice.

In C++, on the other hand, the language specification makes guarantees about both the relative ranges and the relative sizes of the types, so in C++, in addition to the above range relationship inherited from C, it is guaranteed that sizeof(int) is greater than or equal to sizeof(short).

Secondly, the C language standard guarantees a minimum range for each integer type (these guarantees are present in both C and C++). Knowing the minimum range for a given type, you can always say how many value-forming bits this type is required to have (as a minimum number of bits). For example, it is true that type long is required to have at least 32 value-forming bits in order to satisfy its range requirements. If you want to convert that into bytes, it will depend on what you understand by the term byte. If you are talking specifically about 8-bit bytes, then indeed type long will always consist of at least four 8-bit bytes. However, that does not mean that sizeof(long) is always at least 4, since in C/C++ terminology the term byte refers to char objects. char objects are not limited to 8 bits. It is quite possible to have a 32-bit char type in some implementation, meaning that sizeof(long) in C/C++ bytes can legally be 1, for example.
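The distinction between value-forming bits, storage bits and sizeof can be made concrete. A rough sketch, assuming C++11; value_bits and padding_bits are hypothetical helper names, and std::numeric_limits<T>::digits is used as the count of value bits for an unsigned type.

```cpp
#include <cassert>
#include <climits>
#include <limits>

// Value-forming bits of an unsigned integer type T.
template <typename T>
constexpr int value_bits() {
    return std::numeric_limits<T>::digits;
}

// Bits of storage that do not contribute to the value (zero on most platforms).
template <typename T>
constexpr int padding_bits() {
    return static_cast<int>(sizeof(T) * CHAR_BIT) - value_bits<T>();
}

// Guaranteed: at least 32 value bits in (unsigned) long.
static_assert(value_bits<unsigned long>() >= 32, "long needs 32 value-forming bits");
// Guaranteed: storage is never smaller than the value bits it holds.
static_assert(padding_bits<unsigned long>() >= 0, "storage covers all value bits");
// The C++-specific size relationship mentioned above.
static_assert(sizeof(short) <= sizeof(int), "C++: int is at least as large as short");
// All that can be said about sizeof(long) in general: it is at least 1 char.
static_assert(sizeof(long) >= 1, "sizeof(long) may legally be 1 if char is 32 bits");
```

The last assertion is the point of the example: the 32-bit requirement is on bits, and sizeof only reports how many char-sized units hold them.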



Answer 6:

The C standard does not explicitly say that long has to be at least 4 bytes, but it does specify a minimum range for each of the integral types, which implies a minimum size.

For example, the minimum range of an unsigned long is 0 to 4,294,967,295. You need at least 32 bits to represent every single number in that range. So yes, the standard guarantees (indirectly) that a long is at least 32 bits.

C++ inherits the data types from C, so you have to go look at the C standard. The C++ standard actually refers to parts of the C standard in this case.



Answer 7:

Just be careful about the fact that some machines have chars that are more than 8 bits. For example, IIRC on the TI C5x, a long is 32 bits, but sizeof(long)==2 because chars, shorts and ints are all 16 bits with sizeof(char)==1.
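The arithmetic behind that example is easy to model. This is a hypothetical sketch, not a description of any real compiler: sizeof_in_chars is an invented helper that computes what sizeof would report for an object of a given bit width if char had a given number of bits.

```cpp
#include <cassert>

// What sizeof would report for an object occupying width_bits of storage
// on a platform whose char has char_bit bits (rounded up to whole chars).
constexpr unsigned sizeof_in_chars(unsigned width_bits, unsigned char_bit) {
    return (width_bits + char_bit - 1) / char_bit;
}

// Familiar case: 8-bit char, 32-bit long.
static_assert(sizeof_in_chars(32, 8) == 4, "32-bit long with 8-bit char");
// The case described above: 16-bit char, 32-bit long.
static_assert(sizeof_in_chars(32, 16) == 2, "32-bit long with 16-bit char");
// The extreme case: 32-bit char, 32-bit long.
static_assert(sizeof_in_chars(32, 32) == 1, "32-bit long with 32-bit char");
```

In all three cases long is 32 bits wide; only the unit that sizeof counts in has changed.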