I Learned About 2's Complement
and unsigned and signed int. So I Decided to test my knowledge , as far as i know that a negative number is stored in 2's complement
way so that addition and subtraction would not have different algorithm and circuitry would be simple.
Now If I Write
int main()
{
int a = -1 ;
unsigned int b = - 1 ;
printf("%d %u \n %d %u" , a ,a , b, b);
}
Output Comes To Be -1 4294967295 -1 4294967295
. Now , i looked at the bit pattern and various things and then i realized that -1
in 2's complement is 11111111 11111111 11111111 11111111
, so when i interpret it using %d , it gives -1
, but when i interpret using %u
, it treat it as a positive number and so it gives 4294967295
. I Checked the Assembly Of the code is
.LC0:
.string "%d %u \n %d %u"
main:
push rbp
mov rbp, rsp
sub rsp, 16
mov DWORD PTR [rbp-4], -1
mov DWORD PTR [rbp-8], -1
mov esi, DWORD PTR [rbp-8]
mov ecx, DWORD PTR [rbp-8]
mov edx, DWORD PTR [rbp-4]
mov eax, DWORD PTR [rbp-4]
mov r8d, esi
mov esi, eax
mov edi, OFFSET FLAT:.LC0
mov eax, 0
call printf
mov eax, 0
leave
ret
Now here -1
is moved to the register both the times in unsigned and signed . What i want to know if reinterpretation is only that matters , then why do we have two types unsigned
and signed
, it is printf
format string %d
and %u
that matters ?
Further what really happens when i assign negative number to a unsigned integer (I learned That The initializer converts this value from int
to unsigned int
. ) but in the assembly code I did not saw such a thing. So what really happens ??
And How does Machine knows when it has to do 2's complement
and when not , does it see the negative sign and performs 2's complement
?
I have read almost every question and answer you could think this question be duplicate of , but I could not find a satisfactory solution.
Both signed and unsigned are pieces of memory and according to operations it matters how they behave.
It doesn't make any difference when adding or subtracting because due to 2-complement the operations are exactly the same.
It matters when we compare two numbers: -1 is lower than 0 while 4294967295 obviously isn't.
About conversion - for the same size it simply takes variable content and moves it to another - so 4294967295 becomes -1. For bigger size it's first signed extended and then content is moves.
How does machine now - according the instruction we use. Machines have either different instructions for comparing signed and unsigned or they provide different flags for it (x86 has Carry for unsigned overflow and Overflow for signed overflow).
Additionally, note that C is relaxed how the signed numbers are stored, they don't have to be 2-complements. But nowadays, all common architectures store the signed like this.
There are a few differences between signed and unsigned types:
The behaviors of the operators <
, <=
, >
, >=
, /
, %
, and >>
are all different when dealing with signed and unsigned numbers.
Compilers are not required to behave predictably if any computation on a signed value exceeds the range of its type. Even when using operators which would behave identically with signed and unsigned values in all defined cases, some compilers will behave in "interesting" fashion. For example, a compiler given x+1 > y
could replace it with x>=y
if x
is signed, but not if x
is unsigned.
As a more interesting example, on a system where "short" is 16 bits and "int" is 32 bits, a compiler given the function:
unsigned mul(unsigned short x, unsigned short y) { return x*y; }
might assume that no situation could ever arise where the product would exceed 2147483647. For example, if it saw the function invoked as unsigned x = mul(y,65535);
and y
was an unsigned short
, it may omit code elsewhere that would only be relevant if y
were greater than 37268.
It seems you seem to have missed the facts that firstly, 0101 = 5 in both signed and unsigned integer values and that secondly, you assigned a negative number to an unsigned int - something your compiler may be smart enough to realise and, therfore, correct to a signed int.
Setting an unsigned int to -5 should technically cause an error because unsigned ints can't store values under 0.
You could understand it better when you try to assign a negative value to a larger sized unsigned integer. Compiler generates the assembly code to do sign extension when transferring small size negative value to larger sized unsigned integer.
see this blog post for assembly level explanation.
Choice of signed integer representation is left to the platform. The representation applies to both negative and non-negative values - for example, if 11012
(-5) is the two's complement of 01012
(5), then 01012
(5) is also the two's complement of 11012
(-5).
The platform may or may not provide separate instructions for operations on signed and unsigned integers. For example, x86 provides different multiplication and division instructions for signed (idiv
and imul
) and unsigned (div
and mul
) integers, but uses the same addition (add
) and subtraction (sub
) instructions for both.
Similarly, x86 provides a single comparison (cmp
) instruction for both signed and unsigned integers.
Arithmetic and comparison operations will set one or more status register flags (carry, overflow, zero, etc.). These can be used differently when dealing with words that are supposed to represent signed vs. unsigned values.
As far as printf
is concerned, you're absolutely correct that the conversion specifier determines whether the bit pattern 0xFFFF
is displayed as -1
or 4294967295
, although remember that if the type of the argument does not match up with what the conversion specifier expects, then the behavior is undefined. Using %u
to display a negative signed int
may or may not give you the expected equivalent unsigned value.