As the title says, I get a "weird" result when running the following code:
#include <stdio.h>

int main()
{
    char buff[4] = {0x17, 0x89, 0x39, 0x40};
    unsigned int* ptr = (unsigned int*)buff;
    char a = (char)((*ptr << (0*8)) >> (3*8));
    char b = (char)((*ptr << (1*8)) >> (3*8));
    char c = (char)((*ptr << (2*8)) >> (3*8));
    char d = (char)((*ptr << (3*8)) >> (3*8));
    printf("0x%x\n", *ptr);
    printf("0x%x\n", a);
    printf("0x%x\n", b);
    printf("0x%x\n", c);
    printf("0x%x\n", d);
    return 0;
}
Output:
0x40398917
0x40
0x39
0xffffff89
0x17
Why am I not getting 0x89?
It's because your char variables are signed and they're undergoing sign extension when being promoted (converted to a wider type, here int, when they are passed to printf). Sign extension preserves the value during this promotion, so that -119 stays as -119 whether it's 8-bit, 16-bit or a wider type. Printed with %x, the 32-bit representation of -119 shows up as 0xffffff89.
You can fix it by explicitly using unsigned char since, in C at least, whether char is signed or unsigned is implementation-defined. From C11 6.2.5 Types /15:
The implementation shall define char to have the same range, representation, and behavior as either signed char or unsigned char.
Sign extension does not come into play for unsigned types because they're, ... well, unsigned :-)
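To make the difference concrete, here is a minimal sketch of the two widenings. The helper names widen_signed, widen_unsigned and show_widening are hypothetical (they are not from the question), and the printed values assume a 32-bit int on a two's complement machine.

```c
#include <stdio.h>

/* Hypothetical helpers (not from the question): each one widens an 8-bit
   value to int, which is what happens to a char argument passed to printf. */
int widen_signed(signed char c)     { return c; }  /* sign-extends */
int widen_unsigned(unsigned char c) { return c; }  /* zero-extends */

/* Print both widenings of the bit pattern 0x89. */
void show_widening(void)
{
    signed char   s = -119;  /* bit pattern 0x89 in two's complement */
    unsigned char u = 0x89;  /* 137 */

    /* Cast to unsigned int so the argument matches the %x conversion. */
    printf("0x%x\n", (unsigned int)widen_signed(s));    /* 0xffffff89 with 32-bit int */
    printf("0x%x\n", (unsigned int)widen_unsigned(u));  /* 0x89 */
}
```

Both helpers preserve the numeric value of their argument; the outputs differ only because -119 and 137 have different representations once they are 32 bits wide.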
On many platforms char is signed, which means its values run from -128 to 127, and 0x89 (137) does not fit in that range. If you change char to unsigned char, you will get the numbers you expect.
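As a sketch of that fix, the shift expression from the question can be wrapped in a function that returns unsigned char. The name byte_at is hypothetical, and the code assumes a 32-bit unsigned int, as the question's output suggests.

```c
#include <assert.h>

/* Return byte `index` of `word`, where index 0 is the most significant byte.
   Same shift pair as in the question; assumes unsigned int is 32 bits wide. */
unsigned char byte_at(unsigned int word, int index)
{
    assert(index >= 0 && index <= 3);
    /* The left shift discards the bytes above the wanted one,
       the right shift moves the wanted byte down to the low 8 bits. */
    return (unsigned char)((word << (index * 8)) >> (3 * 8));
}
```

With word equal to 0x40398917, byte_at(word, 2) is 0x89, and printf("0x%x\n", byte_at(word, 2)) prints 0x89 because an unsigned char is zero-extended when it is promoted.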
Use memcpy, not a cast
char buff[4] = {0x17, 0x89, 0x39, 0x40};
unsigned int* ptr = (unsigned int*)buff;
This is not correct: buff does not point to an unsigned int object or array, so reading an unsigned int through the cast (unsigned int*)buff is not defined.
The safe way to reinterpret buff as an unsigned int is with memcpy:
char buff[4] = {0x17, 0x89, 0x39, 0x40};
unsigned int ui;
assert (sizeof ui == sizeof buff);
/* memcpy takes the destination first, then the source */
memcpy (&ui, buff, sizeof ui);
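For a version that can be compiled on its own, the same idea can be sketched as a small function. The name load_uint is hypothetical, and the value it returns depends on the machine's byte order: on a little-endian machine with a 4-byte unsigned int, the question's bytes 0x17, 0x89, 0x39, 0x40 give 0x40398917.

```c
#include <string.h>

/* Copy sizeof(unsigned int) bytes from `bytes` into an unsigned int.
   The caller must supply at least that many bytes. No pointer cast is
   involved, so alignment and aliasing are not an issue. */
unsigned int load_uint(const unsigned char *bytes)
{
    unsigned int value;
    memcpy(&value, bytes, sizeof value);  /* destination first, then source */
    return value;
}
```

Using unsigned char for the byte buffer also avoids the signed char conversion that caused the original surprise.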
When using memcpy, you have to make sure the bit representation you copy is valid for the destination type, of course.
One portable but degenerate way to do that is to check that the representation matches an existing object (beware, the following is silly code):
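char *null_ptr = 0;
char null_bytes[sizeof null_ptr] = {0};
/* compare the bytes of the pointer object itself, hence &null_ptr */
if (memcmp (&null_ptr, null_bytes, sizeof null_bytes)==0) {
    char *ptr2;
    /* all-zero bytes are a valid null pointer here: copy them into ptr2 */
    memcpy (&ptr2, null_bytes, sizeof null_bytes);
    assert (ptr2 == 0);
}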
This code uses memcpy and has fully defined behavior (even if useless). On the other hand, the behavior of
int *ptr3 = (int*)null_bytes;
is not defined, because null_bytes is not the address of an int or unsigned int.