I am using the following hashing function provided in the K&R book.
#define HASHSIZE 101

unsigned hash(char *s)
{
    unsigned hashval;

    for (hashval = 0; *s != '\0'; s++)
        hashval = *s + 31 * hashval;
    return hashval % HASHSIZE;
}
In my project, I have more warnings turned on (and warnings are treated as errors), so the above code fails to compile:
error: conversion to ‘unsigned int’ from ‘char’ may change the sign of the result
If I make hashval signed, I get negative hash values. I am wondering how this can be fixed.
Any help?
What your compiler is picking up on and warning you about is that you are implicitly changing your interpretation of the bytes stored in the area pointed to by s. The function prototype specifies s as being a pointer to a char and, by default on your setup, chars seem to be signed. However, to get the hash arithmetic correct, you need to use only unsigned values. So the question is this: what should the compiler do with values pointed to through s which actually have negative values?
Let's take a quick diversion to make sure we understand what values we might be considering. The possible values for a char are CHAR_MIN to CHAR_MAX inclusive, and when char is signed, as it is here, CHAR_MIN is negative. (These values can be found in limits.h.) The possible values for an unsigned char are 0 to UCHAR_MAX inclusive. So the question becomes this: how do we represent the possible range of values from CHAR_MIN to CHAR_MAX within the range 0 to UCHAR_MAX?
One simple approach is to let the compiler carry out this conversion for you. It uses wrap-around arithmetic to ensure that the value is within limits: it adds UCHAR_MAX + 1 enough times to get a value which is within the range 0 to UCHAR_MAX. However, the value that ends up in the arithmetic can depend on the compiler you are using, because the standard leaves it to the implementation whether plain char is signed. It is this possibility of non-portability which lies behind your compiler warning.
OK, so where does that get us? Well, if you are prepared to take responsibility for the hypothetical portability problems which this approach will produce, you can tell the compiler that you are happy for it to make the conversion using the standard rules. You do this by using a cast, for example by writing (unsigned char)*s in place of *s.
This approach will suppress the warning and ensure that your arithmetic is all done as unsigned, which is what you want for this sort of hash function. However, you need to be aware that the same code on other systems may give different hash results.
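As a minimal sketch of the cast approach, here is the K&R function with each character converted to unsigned char before it enters the arithmetic. The const qualifier on the parameter and the 31u literal are my own additions, not part of the original function, and the comment about byte values assumes 8-bit chars.

```c
#define HASHSIZE 101

/* K&R-style string hash. Each character is cast to unsigned char, so
   bytes above 0x7F contribute 128..255 instead of negative values
   (assuming 8-bit chars). */
unsigned hash(const char *s)   /* const is an assumption, not in K&R */
{
    unsigned hashval;

    for (hashval = 0; *s != '\0'; s++)
        hashval = (unsigned char)*s + 31u * hashval;
    return hashval % HASHSIZE;
}
```

With the cast in place, the byte 0xE9 is treated as 233 whether or not plain char is signed, so it should hash the same way on both kinds of compiler.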
An alternative approach is to use the fact that the ANSI C standard specifies that pointers can validly be cast to type unsigned char * to access the underlying byte structure of the data being pointed to. (I don't have my copy of the standard to hand at the moment, or I'd give you a reference.) This would allow you to generalise this approach to producing a function which gives you a hash of a value of any data type. (However, to do this you must think about how you know the size of the data being passed in.)
I hope this gives you a bit of insight into what's going on.
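The generalised byte-wise hash described above could be sketched roughly as follows. The name hash_bytes and its (pointer, size) parameter list are hypothetical choices for illustration. The caller supplies the size explicitly, which is one way of answering the "how do you know the size" question.

```c
#include <stddef.h>

#define HASHSIZE 101

/* hash_bytes is a hypothetical name: it hashes `size` bytes starting
   at `data`, reading them through an unsigned char pointer. */
unsigned hash_bytes(const void *data, size_t size)
{
    const unsigned char *p = data;
    unsigned hashval = 0;
    size_t i;

    for (i = 0; i < size; i++)
        hashval = p[i] + 31u * hashval;
    return hashval % HASHSIZE;
}
```

For a string you would call hash_bytes(s, strlen(s)), and for another object x something like hash_bytes(&x, sizeof x). Be careful with structs: padding bytes have unspecified values, so two structs with equal members may not hash equally.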
I think you can safely typecast your char to unsigned: (unsigned char)*s
Change s to be unsigned char * in the function signature, or simply cast when you use it (i.e. (unsigned char *)s).