In C, strings are arrays of char (char *) and characters are usually stored in char. I noticed that some functions from libc take an int argument instead of a char.
For instance, take the functions toupper() and tolower(), which both use int. The man page says:
If c is not an unsigned char value, or EOF, the behavior of these functions is undefined.
My guess is that with an int, toupper and tolower are able to deal with both unsigned char and EOF. But in fact EOF is in practice (is there any rule about its value?) a value that can be stored in a char, and since those functions won't transform EOF into something else, I'm wondering why toupper does not simply take a char as argument.
In any case why do we need to accept something that is not a character (such as EOF)? Could someone provide me a relevant use case?
This is similar with fputc or putchar, which also take an int that is converted into an unsigned char anyway.
I am looking for the precise motivations for that choice. I want to be convinced; I don't want to have to answer that I don't know if someone asks me one day.
C11 7.4
C11 7.21.1
The C standard explicitly states that EOF is always an int with a negative value. Furthermore, the signedness of the default char type is implementation-defined, so it may be unsigned and not able to store a negative value (C11 6.2.5).
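As a rough illustration of the two points above, the following sketch checks them on whatever implementation compiles it. The function names are hypothetical and chosen only for this example; the macros EOF, CHAR_MIN and UCHAR_MAX come from the standard headers.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if plain char is a signed type here.
   CHAR_MIN is 0 when plain char is unsigned, negative otherwise. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Hypothetical helper: returns 1 if EOF lies outside 0..UCHAR_MAX.
   The standard requires EOF to be a negative int, so an int parameter
   can carry every unsigned char value plus EOF without any overlap. */
int eof_is_out_of_band(void)
{
    return EOF < 0;
}

/* Prints what this particular implementation does. */
void report_char_properties(void)
{
    printf("EOF = %d\n", EOF);
    printf("plain char is %s\n",
           plain_char_is_signed() ? "signed" : "unsigned");
    printf("unsigned char range: 0..%d\n", (int)UCHAR_MAX);
}
```

Calling report_char_properties() from main shows the values for a given platform. Where plain char is unsigned, a char simply has no spare value left to represent EOF.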
If c is not an unsigned char value, or EOF, the behavior of these functions is undefined.
But EOF is a negative int in C, and some platforms (hi ARM!) have char the same as unsigned char.
Back in the day, a coding method included:
ch with the value of EOF then could be used in various functions like isalpha(), tolower().
This style caused problems with putchar(EOF), which I suspect did the same as putchar(255).
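The putchar(EOF) suspicion can be checked with a small sketch. It relies only on the documented rule that fputc and putchar convert their int argument to unsigned char before writing; byte_written_for is a hypothetical helper name used for illustration, not a library function.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: the byte value that putchar(c) or fputc(c, stream)
   would write, i.e. the int argument converted to unsigned char. */
unsigned char byte_written_for(int c)
{
    return (unsigned char)c;
}

/* Prints the byte that putchar(EOF) would emit on this implementation. */
void show_eof_byte(void)
{
    printf("putchar(EOF) would write byte %u\n",
           (unsigned)byte_written_for(EOF));
}
```

Converting -1 to unsigned char always yields UCHAR_MAX, so on an implementation where EOF is -1 and char has 8 bits (common, but neither is guaranteed) the byte written is 255.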
.The method is discouraged today for various reasons. Various models like the following are preferred.
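The answer's original examples did not survive here, so what follows is only a sketch of a commonly recommended pattern, not necessarily the one the author had in mind. It keeps the result of fgetc in an int so that EOF stays distinguishable from every character, and casts a plain char to unsigned char before handing it to a ctype function. The names upcase_in_place and copy_upper are hypothetical.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: upper-cases a string in place. Each char is cast
   to unsigned char first, so a negative char value is never passed to
   toupper (which would be undefined behavior unless it equals EOF). */
void upcase_in_place(char *s)
{
    for (; *s != '\0'; ++s)
        *s = (char)toupper((unsigned char)*s);
}

/* Hypothetical helper: copies a stream, upper-casing as it goes.
   ch is an int, so the EOF test cannot be confused with a data byte.
   fgetc returns either EOF or an unsigned char value, so ch can be
   passed to toupper directly inside the loop.
   Returns the number of characters copied. */
long copy_upper(FILE *in, FILE *out)
{
    int ch;
    long count = 0;

    while ((ch = fgetc(in)) != EOF) {
        fputc(toupper(ch), out);
        ++count;
    }
    return count;
}
```

A call such as copy_upper(stdin, stdout) filters standard input. The point of the design is that EOF is checked once, right where the character is read, and never passed on to putchar or fputc.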