One often needs to read from memory one byte at a time, like in this naive memcpy()
implementation:
    void *memcpy(void *dest, const void *src, size_t n)
    {
        const char *from = (const char *)src;  /* keep the const qualifier of src */
        char *to = (char *)dest;
        while (n--) *to++ = *from++;
        return dest;
    }
However, I sometimes see people explicitly use unsigned char * instead of just char *.
Of course, char and unsigned char may not be equal. But does it make a difference whether I use char *, signed char *, or unsigned char * when bytewise reading/writing memory?
UPDATE: Actually, I'm fully aware that c=200 may have different values depending on the type of c. What I am asking here is why people sometimes use unsigned char * instead of just char * when reading memory, e.g. in order to store a uint32_t in a char[4].
You should use unsigned char. The C99 standard says that unsigned char is the only type guaranteed to be dense (no padding bits), and it also defines that you may copy any object (except bit-fields) exactly by copying it into an unsigned char array, which is the object representation in bytes.
The sensible interpretation of this, to me, is that if you use a pointer to access an object as bytes, you should use unsigned char.
Reference: http://blackshell.com/~msmud/cstd.html#6.2.6.1 (From C1x draft C99)
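As a minimal sketch of that rule, the following copies an object's representation into an unsigned char array and back again. The helper names copy_bytes and roundtrip_u32 are hypothetical, chosen only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy n bytes of an object representation,
   accessing both objects through unsigned char pointers. */
void copy_bytes(void *dest, const void *src, size_t n)
{
    unsigned char *to = dest;
    const unsigned char *from = src;
    while (n--)
        *to++ = *from++;
}

/* Hypothetical helper: send a uint32_t through an unsigned char array
   and rebuild it from those bytes. */
uint32_t roundtrip_u32(uint32_t value)
{
    unsigned char bytes[sizeof value];
    uint32_t copy;

    copy_bytes(bytes, &value, sizeof value);  /* object -> bytes */
    copy_bytes(&copy, bytes, sizeof copy);    /* bytes -> object */
    return copy;
}
```

Calling roundtrip_u32(0xDEADBEEF) should give back the same value, since the bytes are copied unchanged in both directions.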
This is one point where C++ differs from C. Generally speaking, C only
guarantees that raw memory access works for unsigned char; char may
be signed, and on a 1's complement or signed magnitude machine, a -0
might be converted to +0 automatically, changing the bit pattern. For
some reason (unknown to me), the C++ committee extends the guarantees
supporting transparent copy (no change in bit patterns) to char, as
well as unsigned char; on a 1's complement or signed magnitude
machine, the implementors have no choice but to make plain char
unsigned, in order to avoid such side effects. (And of course, most
programmers today aren't concerned by such machines anyway.)
Anyway, the end result is that older programmers, who come from a C
background (and maybe have actually worked on a 1's complement or a
signed magnitude machine) will automatically use unsigned char. It's
also a frequent convention to reserve plain char for character data
exclusively, with signed char for very small integral values, and
unsigned char for raw memory, or when bit manipulation is intended.
Such a rule allows the reader to distinguish between different uses
(provided it is followed religiously).
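To illustrate why unsigned char is the usual choice when bit manipulation is intended, here is a small sketch that assembles a 32-bit value from four bytes. The function names load_le32 and load_le32_signed are made up for this example, and the little-endian byte order is just an assumption of the sketch.

```c
#include <stdint.h>

/* Assemble a 32-bit value from four bytes, least significant byte first. */
uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}

/* The same expression through signed char: a negative byte converts to
   a uint32_t with all upper bits set, which corrupts the result. */
uint32_t load_le32_signed(const signed char *p)
{
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}
```

With the bytes FE 01 00 00, the unsigned version yields 0x000001FE. The signed version, given the corresponding values -2, 1, 0, 0, yields 0xFFFFFFFE, because -2 converts to 0xFFFFFFFE before the OR.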
In your code example it makes no difference. But if you want to display/print the value of the byte, then it does (as the highest bit is interpreted differently), and unsigned char seems more suitable.
It depends on what you want to store in the char.
A signed char is guaranteed a range of at least -127 to 127 (-128 to 127 on the usual two's complement machines), whereas an unsigned char ranges from 0 to at least 255.
For pointer arithmetic it doesn't matter.
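The exact limits are implementation-defined and can be read from <limits.h>. A short sketch (the function names print_char_ranges and plain_char_is_signed are invented for this example):

```c
#include <limits.h>
#include <stdio.h>

/* Print the ranges of the three character types on this implementation. */
void print_char_ranges(void)
{
    printf("signed char:   %d to %d\n", SCHAR_MIN, SCHAR_MAX);
    printf("unsigned char: 0 to %u\n", (unsigned)UCHAR_MAX);
    printf("plain char:    %d to %d\n", CHAR_MIN, CHAR_MAX);
}

/* Plain char has the same range as either signed char or unsigned char;
   CHAR_MIN tells which one. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

The printed numbers vary by platform, which is why none are shown here.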
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        unsigned char a[4] = {254, 254, 254, '\0'};
        unsigned char b[4];
        char c[4];
        int i;

        memset(b, 0, 4);
        memset(c, 0, 4);
        /* copy the same four bytes into both arrays */
        memcpy(b, a, 4);
        memcpy(c, a, 4);

        for (i = 0; i < 4; i++)
        {
            printf("\noriginal is %d", a[i]);
            printf("\nchar %d is %d", i, c[i]);
            printf("\nunsigned char %d is %d \n\n", i, b[i]);
        }
        return 0;
    }
Output (on a platform where plain char is signed):
original is 254
char 0 is -2
unsigned char 0 is 254
original is 254
char 1 is -2
unsigned char 1 is 254
original is 254
char 2 is -2
unsigned char 2 is 254
original is 0
char 3 is 0
unsigned char 3 is 0
So here the char and unsigned char arrays both hold the same bytes (only the printed interpretation differs), so it doesn't matter in this case.
Edit:
If you read anything as signed char, the highest bit is still copied as well, so it doesn't matter.