When the code below is run on a 16-bit integer machine such as an MSP430 microcontroller, s32 yields 65446:
#include <stdint.h>

uint16_t u16c;
int32_t s32;

int main(void)
{
    u16c = 100U;
    s32 = 10 - u16c;
}
My understanding is that 10 - u16c gets implicit type promotion to unsigned int. Mathematically, 10 - u16c equals -90. But how is it possible to represent a negative number as an unsigned int? When -90 gets promoted to unsigned int, does it mean that the sign of the number is ignored?
Let's suppose the sign of the number is ignored. The binary representation of 90 is 00000000 01011010.
When this gets assigned to s32, which is a 32-bit wide signed integer variable, how does the transformation take place? For s32 to equal 65446, 90 has to be replaced by its two's complement, which would be 11111111 10100110. I am not confident in my understanding of how s32 becomes 65446.
On a 32-bit integer machine such as an ARM Cortex, s32 is -90, which is correct. To fix this situation on a 16-bit integer machine, u16c needs a typecast to (int16_t). How does this remedy the problem?
Added the hexadecimal representation of s32 as shown in IAR Embedded Workbench (lower right corner): s32 becomes 0x0000FFA6.
So for the MSP430, the machine's conversion from unsigned 16-bit to signed 32-bit simply prepends sixteen zero bits (zero extension).
My understanding is that 10-u16c gets implicit type promotion to unsigned int.
This depends upon the representation of the type of 10 (int, as it were). Your understanding is correct for some systems, and we'll cover that first, but as you'll see later on in this answer, you're missing a big part of the picture.
Section 5.2.4, Environmental limits, specifies that values of type int can range from -32767 to 32767; this range may be extended at the discretion of implementations, but int values must be able to represent at least this range.
uint16_t, however, if it exists (it's not required to), has a range from 0 to 65535. Implementations can't extend that; the range is required to be precisely [0..65535] (hence the reason this type isn't required to exist).
Section 6.3.1.3, Signed and unsigned integers tells us about the conversions to and fro. I couldn't paraphrase it better, so here's a direct quote:
1 When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.
2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.60)
3 Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
This all supports your theory that the int value 10 would get converted to a uint16_t if and only if int is a sixteen-bit type. However, section 6.3.1.8, Usual arithmetic conversions, should be applied first to decide which of the three conversions above takes place, as these rules change the way you'll look at the conversion when int is wider than sixteen bits:
If both operands have the same type, then no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand with signed integer type.
So, as you can see from this, the type of the expression 10-u16c might vary from system to system. On systems where int is sixteen bits, that expression will be a uint16_t.
Mathematically 10-u16c equals to -90. But how is it possible to represent a negative number as an unsigned int. When -90 gets promoted to unsigned int, does it mean that the sign of a number is ignored?
According to Annex H.2.2:
C's unsigned integer types are ''modulo'' in the LIA-1 sense in that overflows or out-of-bounds results silently wrap.
In other words, if 10 gets converted to a uint16_t and the subtraction is performed, the result will be a large number; in this case you can see that number by explicitly converting both operands (i.e. casting them) to uint16_t. You can see a similar effect by using unsigned integer constants such as -90U. This is largely supported by rule #2 from the quote from 6.3.1.3 earlier.
When this gets assigned to s32, which is a 32-bit wide signed integer variable, how does the transformation take place?
The expression 10-u16c is converted according to rule #1 in 6.3.1.3 (quoted above) to an int32_t value and stored as that value.
To fix this situation on a 16-bit integer machine, there needs to be a typecast of (int16_t) for u16c. How does this remedy the problem?
The cast makes the operand signed: (int16_t)u16c converts the value 100 unchanged to a signed type, so both operands of the subtraction are signed, the operation is carried out in signed arithmetic and yields -90, and -90 is then stored unchanged (rule #1 again) into s32.
100 = 0x0064
0x000A - 0x0064 =
0x000A + 0xFF9B + 1 =
0xFFA6 = 65446.
Note that none of the above is either signed or unsigned; addition and subtraction are blind to such things. Now that the 16-bit math is done, the result can be widened to 0xFFFFFFA6 with sign extension. In both cases, 0xFFA6 and 0xFFFFFFA6, the answer is -90 if you interpret those bits as signed; if you interpret them as unsigned, one is 65446 and the other is 4294967206.
Take this:
#include <stdio.h>

int main(void)
{
    unsigned int ra, rb;

    ra = 0x00000005;
    for (rb = 0; rb < 10; rb++)
    {
        ra--;
        printf("0x%08X\n", ra);
    }
    return 0;
}
and you get this:
0x00000004
0x00000003
0x00000002
0x00000001
0x00000000
0xFFFFFFFF
0xFFFFFFFE
0xFFFFFFFD
0xFFFFFFFC
0xFFFFFFFB
which is exactly what you would expect: subtract one from all zeros and you get all ones. It has nothing to do with signed or unsigned. And subtracting 100 from 10 is like running that loop 100 times.
Were you expecting to see:
0x00000004
0x00000003
0x00000002
0x00000001
0x00000000
0x00000000
0x00000000
0x00000000
0x00000000
0x00000000
for the above program? Would that be accurate? Would that make sense? No.
The only curious part of your code is this:
s32 = 0xFFA6;
Now the folks in the comments can jump right in, but does the standard say that your u16c gets converted from unsigned (0x0064) to signed (0x0064), or does it remain unsigned and the 10 (0x000A) is considered unsigned? Basically, do we get 0x000A - 0x0064 = 0xFFA6 as signed math or as unsigned? (My guess is unsigned, since the one thing declared in that operation is unsigned.) Then that bit pattern gets converted to signed: you take a 16-bit pattern and sign-extend it to 32 bits, so 0xFFA6 becomes 0xFFFFFFA6, which is what I get on a desktop Linux machine with gcc...
Short answer:
- The type of s32 is irrelevant for the calculation.
- The integer constant 10 is of type int.
- If int is 16 bits, then no integer promotion takes place. The usual arithmetic conversions convert the operand 10 to type uint16_t (unsigned int), and the operation is carried out on a 16-bit unsigned type.
- Unsigned wrap-around gives 10u - 100u = 65446u = 0xFFA6. This result fits inside an int32_t. Two's complement does not apply, since the types involved in the operation are unsigned.
- If int is 32 bits, then the operand u16c is integer-promoted to type int. No further arithmetic conversions take place, since both operands are of type int after integer promotion. The operation is carried out on a 32-bit signed type, and the result is -90.
Portable code should be written either as:
10 - (int32_t)u16c; // signed arithmetic intended
or as
10u - u16c; // unsigned wrap-around intended