When is uint8_t ≠ unsigned char?

2019-01-05 02:46发布

According to C and C++, CHAR_BIT >= 8.
But whenever CHAR_BIT > 8, uint8_t can't even be represented as 8 bits.
It must be larger, because CHAR_BIT is the minimum number of bits for any data type on the system.

On what kind of a system can uint8_t be legally defined to be a type other than unsigned char?

(If the answer is different for C and C++ then I'd like to know both.)

3条回答
欢心
2楼-- · 2019-01-05 03:13

If it exists, uint8_t must always have the same width as unsigned char. However, it need not be the same type; it may be a distinct extended integer type. It also need not have the same representation as unsigned char; for instance, the bits could be interpreted in the opposite order. This is a silly example, but it makes more sense for int8_t, where signed char might be ones complement or sign-magnitude while int8_t is required to be twos complement.

One further "advantage" of using a non-char extended integer type for uint8_t even on "normal" systems is C's aliasing rules. Character types are allowed to alias anything, which prevents the compiler from heavily optimizing functions that use both character pointers and pointers to other types, unless the restrict keyword has been applied well. However, even if uint8_t has the exact same size and representation as unsigned char, if the implementation made it a distinct, non-character type, the aliasing rules would not apply to it, and the compiler could assume that objects of types uint8_t and int, for example, can never alias.

查看更多
你好瞎i
3楼-- · 2019-01-05 03:14

On what kind of a system can uint8_t be legally defined to be a type other than unsigned char?

In summary, uint8_t can only be legally defined on systems where CHAR_BIT is 8. It's an addressable unit with exactly 8 value bits and no padding bits.

In detail, CHAR_BIT defines the width of the smallest addressable units, and uint8_t can't have padding bits; it can only exist when the smallest addressable unit is exactly 8 bits wide. Providing CHAR_BIT is 8, uint8_t can be defined by a type definition for any 8-bit unsigned integer type that has no padding bits.


Here's what the C11 standard draft (n1570.pdf) says:

5.2.4.2.1 Sizes of integer types 1 The values given below shall be replaced by constant expressions suitable for use in #if preprocessing directives. ... Their implementation-defined values shall be equal or greater in magnitude (absolute value) to those shown, with the same sign.

-- number of bits for smallest object that is not a bit-field (byte)
   CHAR_BIT                                            8

Thus the smallest objects must contain exactly CHAR_BIT bits.


6.5.3.4 The sizeof and _Alignof operators

...

4 When sizeof is applied to an operand that has type char, unsigned char, or signed char, (or a qualified version thereof) the result is 1. ...

Thus, those are (some of) the smallest addressable units. Obviously int8_t and uint8_t may also be considered smallest addressable units, providing they exist.

7.20.1.1 Exact-width integer types

1 The typedef name intN_t designates a signed integer type with width N, no padding bits, and a two’s complement representation. Thus, int8_t denotes such a signed integer type with a width of exactly 8 bits.

2 The typedef name uintN_t designates an unsigned integer type with width N and no padding bits. Thus, uint24_t denotes such an unsigned integer type with a width of exactly 24 bits.

3 These types are optional. However, if an implementation provides integer types with widths of 8, 16, 32, or 64 bits, no padding bits, and (for the signed types) that have a two’s complement representation, it shall define the corresponding typedef names.

The emphasis on "These types are optional" is mine. I hope this was helpful :)

查看更多
欢心
4楼-- · 2019-01-05 03:15

A possibility that no one has so far mentioned: if CHAR_BIT==8 and unqualified char is unsigned, which it is in some ABIs, then uint8_t could be a typedef for char instead of unsigned char. This matters at least insofar as it affects overload choice (and its evil twin, name mangling), i.e. if you were to have both foo(char) and foo(unsigned char) in scope, calling foo with an argument of type uint8_t would prefer foo(char) on such a system.

查看更多
登录 后发表回答