Today I was lying on my bed thinking about programming when a question came to mind that I can't answer on my own. Here is the question.
I read a book that explains why the value of EOF is -1. The explanation is as follows:
Why -1? Normally getchar() returns a value in the range 0 through 127, because those are the values corresponding to the standard character set, but it might return values from 0 through 255 if the system recognizes an extended character set. In either case, the value -1 does not correspond to any character, so it can be used to signal the end of file.
1.) This statement seems odd to me, because I remember that besides signed integers, C also has a signed char type, which means the values from -128 to 127 can be used. So why does the book say that -1 does not correspond to any character that can come from keyboard input?
The standard doesn't say it's -1. It only says this about the macro:
EOF
which expands to an integer constant expression, with type int and a negative value, that is returned by several functions to indicate end-of-file, that is, no more input from a stream
It's not a value that you can read from a file. It's an "out of band" value that getc and such can't return in any case other than "end of file" (or a read error, which is reported the same way).
getchar actually returns an int, which means it has more than 8 bits. If getchar returns a number between 0 and 255 (inclusive), then you know it returned a valid byte from the file you were reading. If it returns -1 you know you reached the end of the file. Because the value is an int, -1 is guaranteed to not be equal to any value between 0 and 255.
None of the answers you've gotten has mentioned a point that (at least IMO) is crucial.
When you read a character, the value is read as an unsigned char, and converted from there to int (§7.19.7.1/2):
If the end-of-file indicator for the input stream pointed to by stream is not set and a
next character is present, the fgetc function obtains that character as an unsigned
char converted to an int and advances the associated file position indicator for the
stream (if defined).
This particular quote is for fgetc -- getc, getchar, and fread are all required to read data as if by calling fgetc.
That requirement does implicitly (at least sort of) assume that char has a smaller range than int, so the conversion to int allows at least one value to be represented that couldn't have come from the file. Elsewhere, the standard denies that as a requirement, but I think that denial is unrealistic -- I'm pretty sure an implementation with char and int the same size would break vast amounts of code.
That's why getchar() returns an int. If it returned signed char, you wouldn't be able to differentiate between EOF and the character whose value is -1 (255 in an extended set). If it returned unsigned char, then -1 couldn't be represented at all. However, getchar() returns int: if the result is in the range 0 to 255 it is a character, and if it is -1 then it's EOF.
getchar() returns an int which contains either a plain (unsigned) char or EOF, so there is no problem.
getchar() returns an int: -1 for EOF, or 0 through 255. The int type is big enough to not have conflicts with those.