If I do this in both clang and Visual Studio:
    unsigned char *a = 0;
    char *b = 0;
    char x = '3';
    a = &x;
    b = (unsigned char *)a;
I get a warning that I am trying to convert between signed and unsigned character pointers, but the code sure works. Still, the compiler is saying it for a reason. Can you point out a situation where this can turn into a problem?
To make it very simple: it's because char represents different things (see the short sketch after this list):

- A single character (char, it doesn't matter if signed or not). When you assign a character like 'A', what you're actually doing is writing the ASCII code of A (65) into that memory location.
- A string (when used as an array or as a pointer to a char buffer).
- An eight-bit number (with or without sign).
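Here is a minimal sketch of those three roles side by side (the variable names are only for illustration):

    #include <stdio.h>

    int main(void)
    {
        char letter = 'A';       /* a single character: stores its character code (65 in ASCII) */
        char text[] = "hi";      /* a string: an array of char ending in '\0' */
        signed char small = -5;  /* an eight-bit number, here with a sign */

        printf("%c %d\n", letter, letter);  /* A 65 (on an ASCII system) */
        printf("%s\n", text);               /* hi */
        printf("%d\n", small);              /* -5 */
        return 0;
    }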
Then, when you convert a signed byte like -1 to an unsigned byte, you'll lose information (at least the sign, and probably the value too); that's why you get a warning:
    signed char a = -1;
    unsigned char b = (unsigned char)a;
    if ((int)b == -1)
        ; // No! Now b is 255!
Strictly speaking, the cast above always yields 255, because C defines signed-to-unsigned conversion as arithmetic modulo 256; it's only when you reinterpret the bits through a pointer that the result depends on how your system represents negative numbers (I never worked with a system that doesn't use 2's complement, but they exist). In this example the exact value doesn't really matter: the concept is that a signed/unsigned conversion may discard information. It doesn't matter whether this happens because of an explicit cast or a cast through pointers: the bits will represent something else (and the result will change according to implementation, environment and actual value).
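A small sketch of the two ways the bits can end up meaning something else; the printed values assume the usual 8-bit, 2's-complement char:

    #include <stdio.h>

    int main(void)
    {
        signed char a = -1;

        /* value conversion: defined by the standard as arithmetic modulo 256, so always 255 */
        unsigned char by_cast = (unsigned char)a;

        /* reinterpretation through a pointer: reads whatever bits -1 happens to have,
           which is also 255 on a 2's-complement machine */
        unsigned char by_alias = *(unsigned char *)&a;

        printf("%d %d\n", by_cast, by_alias);  /* 255 255 here */
        return 0;
    }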
Note that in the C standard, char, signed char and unsigned char are formally three distinct types. Most of the time you won't care (and VS will default char to signed or unsigned according to a compiler option, but that isn't portable), and you may need a cast.
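For example, this is roughly the diagnostic the distinct types buy you (just a sketch; the exact warning text varies by compiler):

    void demo(void)
    {
        char c = 'x';
        unsigned char *up;

        /* up = &c;               warns: char * and unsigned char * are distinct pointer types */
        up = (unsigned char *)&c; /* the cast makes the conversion explicit and silences it */
        (void)up;                 /* avoid an unused-variable warning */
    }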
Your code is correct (any type can be aliased by unsigned char). Also, on 2's-complement systems, this alias is the same as the result of a value conversion.
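For instance, inspecting the bytes of any object through an unsigned char * is well defined; a short sketch (which bytes you see, and in what order, depends on your platform):

    #include <stdio.h>

    int main(void)
    {
        int n = 258;
        unsigned char *p = (unsigned char *)&n;  /* aliasing an int by unsigned char: allowed */

        for (size_t i = 0; i < sizeof n; ++i)
            printf("byte %zu: %d\n", i, p[i]);   /* e.g. 2 1 0 0 for a little-endian 32-bit int */
        return 0;
    }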
The reverse operation, aliasing unsigned char by char, is only a problem on esoteric systems that have trap representations for plain char.
I don't know of any such systems ever existing, although the C standard provides for their existence. Unfortunately a cast is required because of this possibility, which is more annoying than useful IMHO.
Aliasing unsigned char by char gives the same result as the value conversion on every modern system that I know of (technically it's implementation-defined, but in practice everyone implements the value conversion so that it preserves the representation).
NB. Definition of terms, taking unsigned char x = 250; as an example (a runnable comparison follows):

- alias: char y = *(char *)&x;
- conversion: char y = x;
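A quick sketch running both forms side by side; on an ordinary signed-char, 2's-complement platform both print -6, which is the point made above (where plain char is unsigned, both print 250):

    #include <stdio.h>

    int main(void)
    {
        unsigned char x = 250;

        char alias      = *(char *)&x;  /* reinterpret the same byte as char */
        char conversion = x;            /* value conversion (implementation-defined when char is signed) */

        printf("alias = %d, conversion = %d\n", alias, conversion);
        return 0;
    }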
The char type can be either signed or unsigned depending on the platform. Code that casts a char to unsigned or signed char might work fine on one platform, but not when the data is transferred across operating systems, etc. See this URL:
http://www.trilithium.com/johan/2005/01/char-types/
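A sketch of the kind of check whose outcome flips between platforms, purely because plain char may be signed on one and unsigned on another:

    #include <stdio.h>

    int main(void)
    {
        char c = 0xFF;   /* typically becomes -1 where char is signed, 255 where it is unsigned */

        if (c < 0)
            printf("plain char is signed here (c == %d)\n", c);
        else
            printf("plain char is unsigned here (c == %d)\n", c);
        return 0;
    }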
Because you can lose some values - look at this:
    #include <stdio.h>
    int main(void) {
        unsigned char *a = 0;
        char b = -3;
        a = &b;              /* the warned-about signed/unsigned pointer assignment */
        printf("%d\n", *a);
    }
Result: 253
Let me explain this. On a 2's-complement machine, -3 stored in a char has the bit pattern 11111101, and reading that same byte through an unsigned char * interprets it as 253 (that is, 256 - 3). Just look at the ranges:

unsigned char: from 0 to 255
signed char: from -128 to 127
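A one-line check of that arithmetic, assuming the usual 2's-complement representation:

    #include <stdio.h>

    int main(void)
    {
        char b = -3;
        printf("%d\n", (unsigned char)b);  /* 253: the same byte read on the 0..255 scale */
        return 0;
    }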