Arduino left shift not working as expected, compil

Posted 2019-09-17 01:00

Question:

uint32_t a = 0xFF << 8;
uint32_t b = 0xFF;
uint32_t c = b << 8;

I'm compiling for the Uno (1.0.x and 1.5) and it would seem obvious that a and c should be the same value, but they are not... at least not when running on the target. I compile the same code on the host and have no issues.

Right shift works fine; left shift only works when I'm shifting a variable rather than a constant.

Can anyone confirm this?

I'm using Visual Micro with VS2013. Compiling with either 1.0.x or 1.5 Arduino results in the same failure.

EDIT:

On the target:

A = 0xFFFFFF00
C = 0x0000FF00

Answer 1:

The problem is related to the signed/unsigned implicit cast.

With uint32_t a = 0xFF << 8; this is what happens:

  • 0xFF is declared; it is a signed char;
  • There is a << operation, so that variable is converted to int. Since it was a signed char (and so its value was -1) it is padded with 1, to preserve the sign. So the variable is 0xFFFFFFFF;
  • it is shifted, so a = 0xFFFFFF00.

NOTE: this is slightly wrong, see below for the "more correct" version

If you want to reproduce the same behaviour, try this code:

uint32_t a = 0xFF << 8;
uint32_t b = (signed char)0xFF;
uint32_t c = b << 8;

Serial.println(a, HEX);
Serial.println(b, HEX);
Serial.println(c, HEX);

The result is

FFFFFF00
FFFFFFFF
FFFFFF00

Conversely, if you write

uint32_t a = (unsigned)0xFF << 8;

you get that a = 0x0000FF00.

There are just two weird things with the compiler:

  1. uint32_t a = (unsigned char)0xFF << 8; returns a = 0xFFFFFF00
  2. uint32_t a = 0x000000FF << 8; returns a = 0xFFFFFF00 too.

Maybe it's a wrong cast in the compiler....

EDIT:

As phuclv pointed out, the above explanation is slightly wrong. The correct explanation is that, with uint32_t a = 0xFF << 8;, the compiler performs these steps:

  • 0xFF is declared; it is an int;
  • There is a << operation, and thus this becomes 0xFF00; the type is still int, which is 16 bits wide on the Uno, so the sign bit is set and the value is negative
  • it is then converted to uint32_t. Since it was negative, 1s are prepended, resulting in 0xFFFFFF00

The difference from the first explanation is that uint32_t a = 0xFF << 7; yields 0x00007F80 rather than 0xFFFFFF80.
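
To see these steps on a host machine where int is wider than 16 bits, here is a minimal sketch that emulates the Uno's 16-bit int with int16_t. The helper names shl_int16 and widen are made up for illustration, and the code assumes the usual two's complement wrap-around when converting to int16_t (that is what happens in practice, although shifting into the sign bit is strictly undefined):

```cpp
#include <cstdint>

// Hypothetical helper: mimics `value << n` on a target where int is 16 bits.
// The shift is done on the bit pattern, only the low 16 bits are kept, and
// the result is reinterpreted as a signed 16-bit value.
constexpr int16_t shl_int16(int16_t value, unsigned n) {
    return static_cast<int16_t>(
        static_cast<uint16_t>(static_cast<uint16_t>(value) << n));
}

// Hypothetical helper: the assignment to uint32_t. A negative 16-bit value
// ends up with all upper bits set.
constexpr uint32_t widen(int16_t v) {
    return static_cast<uint32_t>(v);
}
```

With these helpers, widen(shl_int16(0xFF, 8)) gives 0xFFFFFF00 and widen(shl_int16(0xFF, 7)) gives 0x00007F80, matching the values above.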

This also explains the two "weird" things I wrote in the end of the previous answer.

For reference, the thread linked in the comment has more explanations of how the compiler interprets literals. In particular, one answer there has a table with the types the compiler assigns to literals. In this case (no suffix, hexadecimal value) the compiler assigns the first type in the following list that can represent the value:

  1. int
  2. unsigned int
  3. long int
  4. unsigned long int
  5. long long int
  6. unsigned long long int
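
If your compiler supports C++11, you can ask it which type it picked. This is only a sketch; the alias name type_of_0xFFFF is hypothetical, and it uses INT_MAX to decide whether 0xFFFF fits in int on the machine you compile for:

```cpp
#include <climits>
#include <type_traits>

// Hypothetical alias: the type the list above predicts for the literal
// 0xFFFF. It is int where int can hold 65535, otherwise unsigned int.
using type_of_0xFFFF =
    std::conditional<(INT_MAX >= 0xFFFF), int, unsigned int>::type;

// 0xFF fits in int everywhere, so it is always an int.
static_assert(std::is_same<decltype(0xFF), int>::value,
              "0xFF is an int");

// 0xFFFF is an int with a 32-bit int, an unsigned int with a 16-bit int.
static_assert(std::is_same<decltype(0xFFFF), type_of_0xFFFF>::value,
              "0xFFFF gets the first type that can hold it");

// A U suffix removes the signed candidates from the list.
static_assert(std::is_same<decltype(0xFFU), unsigned int>::value,
              "0xFFU is an unsigned int");
```

The checks run at compile time, so the file compiling cleanly is the whole result.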

This leads to some more considerations:

  • uint32_t a = 0x7FFF << 8; the literal fits in a 16-bit int, so it is interpreted as a signed integer; the shifted result 0xFF00 is negative, the conversion to the bigger integer extends the sign, and so the result is 0xFFFFFF00
  • uint32_t b = 0xFFFF << 8; the literal does not fit in a 16-bit int, so it is interpreted as an unsigned integer. The conversion to the 32-bit integer zero-extends 0xFF00, and the result is therefore 0x0000FF00
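
A rough way to reproduce both cases on a PC is to force the arithmetic into 16 bits by hand. The function names below are invented for this sketch, and the signed variant assumes two's complement wrap-around:

```cpp
#include <cstdint>

// Literal that fits in a 16-bit int: shift, keep the low 16 bits,
// reinterpret them as signed, then convert to the 32-bit type.
constexpr uint32_t shift_as_int16(uint16_t bits, unsigned n) {
    return static_cast<uint32_t>(
        static_cast<int16_t>(static_cast<uint16_t>(bits << n)));
}

// Literal that only fits in a 16-bit unsigned int: same shift, but the
// 16-bit result stays unsigned, so the conversion zero-extends.
constexpr uint32_t shift_as_uint16(uint16_t bits, unsigned n) {
    return static_cast<uint32_t>(static_cast<uint16_t>(bits << n));
}
```

Here shift_as_int16(0x7FFF, 8) gives 0xFFFFFF00 and shift_as_uint16(0xFFFF, 8) gives 0x0000FF00. The 16-bit pattern after the shift is 0xFF00 in both cases; only its signedness differs.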


Answer 2:

The most important thing here is that on the Arduino Uno int is a 16-bit type. That explains everything.

  1. For uint32_t a = 0xFF << 8: 0xFF is of type int [1]. 0xFF << 8 results in 0xFF00, which is a negative value in a 16-bit int [2]. When the int value is assigned to a uint32_t variable it is sign-extended [3] during the widening conversion, so the result becomes 0xFFFFFF00U

  2. For the following lines

    uint32_t b = 0xFF;
    uint32_t c = b << 8;
    

    0xFF is positive in a 16-bit int, so b also contains 0xFF. Shifting it left by 8 bits results in 0x0000FF00, because b << 8 is a uint32_t expression. It is wider than int, so no promotion to int happens here.

Similarly, with uint32_t a = (unsigned)0xFF << 8 the output is 0x0000FF00, because the positive 0xFF is still positive when converted to unsigned int, and the shift is then done in unsigned int. Widening unsigned int to uint32_t is a zero extension. Likewise, if you do int32_t b = 0xFF; uint32_t c = b << 8 the shift happens in 32 bits, the sign bit stays zero, and the high bits are still zero. The "weird" uint32_t a = 0x000000FF << 8 is not weird either: leading zeros do not change the type of a literal, so 0x000000FF is still an int and the result is the same as for 0xFF << 8. Instead of (unsigned)0xFF you can use the shorter, exactly equivalent 0xFFU.

On the other hand, if you declare b as uint8_t b = 0xFF or int8_t b = 0xFF, things are different: integer promotion to int occurs and the result is the same as for the first line (0xFFFFFF00U). And if you cast 0xFF to signed char like this

uint32_t b = (signed char)0xFF;
uint32_t c = b << 8;

then upon promoting to int it'll be sign-extended to 0xFFFF. Similarly casting it to int32_t or uint32_t will result in a sign-extension from signed char to the 32-bit wide value 0xFFFFFFFF

If you instead cast to unsigned char, as in uint32_t a = (unsigned char)0xFF << 8;, then (unsigned char)0xFF is promoted to int using zero extension [4], so the result is exactly the same as for uint32_t a = 0xFF << 8;
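
These char conversions do not depend on the width of int, so they can be checked on any host. A small sketch (the function names are hypothetical, and converting 0xFF to signed char is assumed to give -1, which is what common compilers do):

```cpp
#include <cstdint>

// signed char -> uint32_t: the value -1 is carried over, so every bit of
// the 32-bit result is set.
constexpr uint32_t from_signed_char(int bits) {
    return static_cast<uint32_t>(static_cast<signed char>(bits));
}

// unsigned char -> int: the value 255 is carried over (zero extension).
constexpr int promote_unsigned_char(int bits) {
    return static_cast<unsigned char>(bits);
}
```

from_signed_char(0xFF) is 0xFFFFFFFF, and shifting that left by 8 gives 0xFFFFFF00, as in the (signed char) snippet above. promote_unsigned_char(0xFF) is plain 255, which is why the unsigned char cast changes nothing compared with writing 0xFF directly.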

In summary: When in doubt, consult the standard. The compiler rarely lies to you


[1] Type of integer literals not int by default?

The type of an integer constant is the first of the corresponding list in which its value can be represented.

Suffix      Decimal Constant          Octal or Hexadecimal Constant
-------------------------------------------------------------------
none        int                       int
            long int                  unsigned int
            long long int             long int
                                      unsigned long int
                                      long long int
                                      unsigned long long int

[2] Strictly speaking, shifting into the sign bit like that is undefined behavior

  • 1 << 31 produces the error, "The result of the '<<' expression is undefined"
  • Defining (1 << 31) or using 0x80000000? Result is different

[3] The rule is to add UINT_MAX + 1

Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

  • Signed to unsigned conversion in C - is it always safe?
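
As a worked example of that rule for a 32-bit target type, here is a small sketch that applies it by hand and can be compared against the built-in conversion. The names kModulus and to_uint32_by_rule are made up for this example:

```cpp
#include <cstdint>

// One more than the maximum value of uint32_t, i.e. 2^32.
constexpr int64_t kModulus = 4294967296LL;

// Hypothetical helper: repeatedly add or subtract 2^32 until the value is
// in the range of uint32_t, exactly as the quoted rule describes.
constexpr uint32_t to_uint32_by_rule(int64_t v) {
    return v < 0          ? to_uint32_by_rule(v + kModulus)
         : v >= kModulus  ? to_uint32_by_rule(v - kModulus)
                          : static_cast<uint32_t>(v);
}
```

For the value in question, -256 + 4294967296 = 4294967040, which is 0xFFFFFF00. So to_uint32_by_rule(-256) agrees with static_cast<uint32_t>(-256), and "sign extension" is just what this rule looks like at the bit level.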

[4] A cast always preserves the input value if the value fits in the target type, so converting a signed type to a wider signed type is done by sign extension, and converting an unsigned type to a wider type is done by zero extension.



Answer 3:

[Credit goes to Mats Petersson]

Using a cast operator to force the compiler to treat the 0xFF as a uint32_t addresses the issue. It seems the Arduino cross-compiler treats constants a little differently, since I've never had to cast before a shift.
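
For reference, a sketch of what that looks like. The variable names are only illustrative; the UL suffix variant is an alternative that makes the literal an unsigned long, which is at least 32 bits wide on any conforming compiler:

```cpp
#include <cstdint>

// Cast first, then shift: the shift is done in 32-bit unsigned arithmetic,
// so nothing can land in a 16-bit sign bit.
constexpr uint32_t a_cast = static_cast<uint32_t>(0xFF) << 8;

// Same effect with a literal suffix instead of a cast.
constexpr uint32_t a_suffix = 0xFFUL << 8;
```

Both come out as 0x0000FF00 regardless of how wide int is on the target.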

Thanks!