C standard and bitshifts

Posted 2019-04-30 09:22

Question:

This question was first inspired by the (unexpected) results of this code:

uint16_t   t16 = 0;
uint8_t     t8 = 0x80;
uint8_t t8_res;

t16    = (t8 << 1);
t8_res = (t8 << 1);

printf("t16: %x\n", t16);    // Expect 0, get 0x100
printf(" t8: %x\n", t8_res); // Expect 0, get 0

But it turns out this makes sense:

6.5.7 Bitwise shift operators

Constraints

2 Each of the operands shall have integer type

Thus the originally confused line is equivalent to:

t16 = (uint16_t) (((int) t8) << 1);

A little non-intuitive IMHO, but at least well-defined.

Ok, great, but then we do:

{
uint64_t t64 = 1;
t64 <<= 31;
printf("t64: %llx\n", (unsigned long long) t64); // Expect 0x80000000, get 0x80000000
t64 <<= 31;
printf("t64: %llx\n", (unsigned long long) t64); // Expect 0x0, get 0x4000000000000000
}

// edit: following the same literal argument as above, the following should be equivalent:

t64 = (uint64_t) (((int) t64) << 31);

// hence my confusion / expectation [end_edit]

Now, we get the intuitive result, but not what would be derived from my (literal) reading of the standard. When / how does this "further automatic type promotion" take place? Or is there a limitation elsewhere that a type can never be demoted (that would make sense?), in that case, how do the promotion rules apply for:

uint32_t << uint64_t

Since the standard does say both arguments are promoted to int, should both arguments be promoted to the same type here?

// edit:

More specifically, what should the result of the following be?

uint32_t t32 = 1;
uint64_t t64_one = 1;
uint64_t t64_res;

t64_res = t32 << t64_one;

// end edit

The answer to the above question is resolved when we recognize that the spec does not demand a promotion to int specifically, rather to an integer type, which uint64_t qualifies as.

// CLARIFICATION EDIT:

Ok, but now I am confused again. Specifically, if uint8_t is an integer type, then why is it being promoted to int at all? It does not seem to be related to the constant int 1, as the following exercise demonstrates:

{
uint16_t t16 = 0;
uint8_t t8 = 0x80;
uint8_t t8_one = 1;
uint8_t t8_res;

t16 = (t8 << t8_one);
t8_res = (t8 << t8_one);

printf("t16: %x\n", t16);
printf(" t8: %x\n", t8_res);
}

t16: 100
 t8: 0

Why is the (t8 << t8_one) expression being promoted if uint8_t is an integer type?

--

For reference, I'm working from ISO/IEC 9899:TC2, WG14/N1124 May 6, 2005. If that's out of date and someone could also provide a link to a more recent copy, that'd be appreciated as well.

Answer 1:

The constraint in §6.5.7 that "Each of the operands shall have integer type." is a constraint that means you cannot use the bitwise shift operators on non-integer types like floating point values or pointers. It does not cause the effect you are noting.

The part that does cause the effect is in the next paragraph:

 3. The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand.

The integer promotions are described in §6.3.1.1:

 2. The following may be used in an expression wherever an int or unsigned int may be used:

  • An object or expression with an integer type whose integer conversion rank is less than or equal to the rank of int and unsigned int.
  • A bit-field of type _Bool, int, signed int, or unsigned int.

If an int can represent all values of the original type, the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.

uint8_t has a lesser rank than int, so the value is converted to an int (since we know that an int must be able to represent all the values of uint8_t, given the requirements on the ranges of those two types).
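One way to confirm this on a given compiler is C11's _Generic, which selects on the type an expression has after the promotions have been applied. The following is only a sketch that assumes a C11 compiler; the helper name shift_u8_left_by_one is made up for illustration.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Compile-time check: after the integer promotions, the expression
   (uint8_t)x << 1 has type int, not uint8_t. */
static_assert(_Generic((uint8_t)0x80 << 1, int: 1, default: 0),
              "uint8_t << int is expected to have type int");

/* Hypothetical helper: returns the shift result before any conversion
   back to a narrower type takes place. */
static int shift_u8_left_by_one(uint8_t value)
{
    return value << 1;   /* value is promoted to int first */
}
```

Calling shift_u8_left_by_one(0x80) yields 0x100; converting that value to uint8_t discards the high bit and leaves 0, which matches the two outputs in the question.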

The ranking rules are complex, but they guarantee that a type with a higher rank cannot have a lesser precision. This means, in effect, that types cannot be "demoted" to a type with lesser precision by the integer promotions (it is possible for uint64_t to be promoted to int or unsigned int, but only if the range of that int or unsigned int is at least that of uint64_t).

In the case of uint32_t << uint64_t, the rule that kicks in is "The type of the result is that of the promoted left operand". So we have a few possibilities:

  • If int is at least 33 bits, then uint32_t will be promoted to int and the result will be int;
  • If int is less than 33 bits and unsigned int is at least 32 bits, then uint32_t will be promoted to unsigned int and the result will be unsigned int;
  • If unsigned int is less than 32 bits then uint32_t will be unchanged and the result will be uint32_t.

On today's common desktop and server implementations, int and unsigned int are usually 32 bits, and so the second possibility will occur (uint32_t is promoted to unsigned int). In the past it was common for int / unsigned int to be 16 bits, and the third possibility would occur (uint32_t left unpromoted).
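Which of the three cases applies on a particular compiler can be inspected with C11 _Generic. This is a rough sketch, not a definitive test: the TYPE_NAME macro and the shift_result_type helper are invented for this example, and only the standard integer types of rank int or higher are listed.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Maps the type of an expression to a readable name. Only the types
   relevant to this discussion are listed; anything else reports "other". */
#define TYPE_NAME(expr) _Generic((expr),            \
    int: "int",                                     \
    unsigned int: "unsigned int",                   \
    long: "long",                                   \
    unsigned long: "unsigned long",                 \
    long long: "long long",                         \
    unsigned long long: "unsigned long long",       \
    default: "other")

/* The type of the right operand has no influence on the result type. */
static_assert(sizeof((uint32_t)1 << (uint64_t)1) ==
              sizeof((uint32_t)1 << (uint8_t)1),
              "right operand type must not affect the result type");

/* Hypothetical helper: names the type of uint32_t << uint64_t. */
static const char *shift_result_type(void)
{
    uint32_t left = 1;
    uint64_t count = 1;
    return TYPE_NAME(left << count);
}
```

On an implementation with a 32-bit int, shift_result_type() should report "unsigned int", which corresponds to the second case in the list above.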

The result of your example:

uint32_t t32 = 1;
uint64_t t64_one = 1;
uint64_t t64_res;

t64_res = t32 << t64_one;

This stores the value 2 in t64_res. Note, though, that the outcome here is not affected by the fact that the type of the expression is not uint64_t. An example of an expression that would be affected is:

uint32_t t32 = 0xFF000;
uint64_t t64_shift = 16;
uint64_t t64_res;

t64_res = t32 << t64_shift;

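The result here is 0xf0000000 when unsigned int is 32 bits wide: the shift is performed in the 32-bit promoted type of the left operand, so the high bits are discarded before the value is converted to uint64_t.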
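If the full 64-bit result is wanted, the usual remedy is to convert the left operand before shifting. The sketch below contrasts the two orderings; both function names are invented for this example, and the comments assume unsigned int is 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* The shift is performed in the promoted type of the left operand
   (unsigned int on typical platforms); only the result is widened. */
static uint64_t shift_then_widen(uint32_t value, unsigned count)
{
    assert(count < 32);          /* larger counts would be undefined here */
    return value << count;
}

/* The left operand is converted to uint64_t first, so the shift itself
   is done in 64 bits and no high bits are lost. */
static uint64_t widen_then_shift(uint32_t value, unsigned count)
{
    assert(count < 64);
    return (uint64_t)value << count;
}
```

With value 0xFF000 and count 16, widen_then_shift returns 0xFF0000000, whereas shift_then_widen returns 0xF0000000 on a platform with a 32-bit unsigned int.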

Note that although the details are fairly intricate, you can boil it all down to a fairly simple rule that you should keep in mind:

In C, arithmetic is never done in types narrower than int / unsigned int.
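The same rule can be seen outside of shifts. As a small illustration (the function names are hypothetical), adding two uint8_t values does not wrap at 256 unless the result is converted back to uint8_t:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Both operands are promoted to int before the addition. */
static_assert(sizeof((uint8_t)1 + (uint8_t)1) == sizeof(int),
              "uint8_t + uint8_t is computed as int");

/* The sum is computed in int, so 200 + 100 gives 300. */
static int add_u8(uint8_t a, uint8_t b)
{
    return a + b;
}

/* Converting back to uint8_t reduces the sum modulo 256. */
static uint8_t add_u8_wrapped(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}
```

Here add_u8(200, 100) gives 300, while add_u8_wrapped(200, 100) gives 44.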



Answer 2:

I think the source of your confusion might be that the following two statements are not equivalent:

  • Each of the operands shall have integer type
  • Each of the operands shall have int type

uint64_t is an integer type.



Answer 3:

You found the wrong rule in the standard :( The relevant rule is something like "the usual integer type promotions apply". This is what hits you in the first example. If an integer type like uint8_t has a rank that is smaller than that of int, it is promoted to int. uint64_t does not have a rank smaller than int or unsigned, so no promotion is performed and the << operator is applied to the uint64_t variable.

Edit: All integer types smaller than int are promoted for arithmetic. This is just a fact of life :) Whether or not uint32_t is promoted depends on the platform, because it might have the same rank or higher than int (not promoted) or a smaller rank (promoted).

Concerning the << operator, the type of the right operand is not really important; what counts for the number of bits is the left one (with the above rules). More important for the right operand is its value. It must not be negative, and it must be less than the width of the (promoted) left operand; otherwise the behavior is undefined.
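The restriction on the right operand can be enforced with a small wrapper. The standard makes the behavior undefined when the count is negative or is greater than or equal to the width of the promoted left operand. The following is a minimal sketch; checked_shl64 is a made-up name, not a standard function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: shifts a 64-bit value left only when the count is
   valid, i.e. strictly less than the 64-bit width of the left operand.
   The count is unsigned, so it cannot be negative. */
static bool checked_shl64(uint64_t value, unsigned count, uint64_t *out)
{
    assert(out != NULL);
    if (count >= 64) {
        return false;          /* shifting by 64 or more is undefined */
    }
    *out = value << count;     /* performed in a type at least 64 bits wide */
    return true;
}
```

For instance, checked_shl64(1, 62, &r) succeeds and stores 0x4000000000000000 in r, the same value the question obtains by shifting by 31 twice, while checked_shl64(1, 64, &r) is rejected.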