I am trying to learn C on my own and I'm kind of confused with getchar and putchar:
1
    #include <stdio.h>

    int main(void)
    {
        char c;
        printf("Enter characters : ");
        while ((c = getchar()) != EOF) {
            putchar(c);
        }
        return 0;
    }
2
    #include <stdio.h>

    int main(void)
    {
        int c;
        printf("Enter characters : ");
        while ((c = getchar()) != EOF) {
            putchar(c);
        }
        return 0;
    }
The C library function int putchar(int c) writes a character (an unsigned char) specified by the argument char to stdout.
The C library function int getchar(void) gets a character (an unsigned char) from stdin. This is equivalent to getc with stdin as its argument.
Does this mean putchar() accepts both int and char, or only one of them? And for getchar(), should we use an int or a char?
TL;DR:
char c; c = getchar(); is wrong, broken and buggy.
int c; c = getchar(); is correct.
This applies to getc and fgetc as well, if not even more so, because one would often read until the end of the file.
Always store the return value of getchar (fgetc, getc...) (and putchar) initially into a variable of type int.
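As a minimal sketch of that rule, the helper below keeps the results of both getc and putc in int variables, so EOF stays distinguishable from every character value. The name copy_stream is invented for this illustration and is not a library function.

```c
#include <stdio.h>

/* Hypothetical helper: copies `in` to `out` one character at a time.
 * Returns the number of characters copied, or -1 if a write fails. */
long copy_stream(FILE *in, FILE *out)
{
    long copied = 0;
    int c;                          /* int, not char: holds 0..UCHAR_MAX and EOF */

    while ((c = getc(in)) != EOF) {
        int written = putc(c, out); /* putc also reports failure as EOF */
        if (written == EOF)
            return -1;
        copied++;
    }
    return copied;
}
```

Calling copy_stream(stdin, stdout) gives the same loop as the second program in the question.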
The argument to putchar can be any of int, char, signed char or unsigned char; its type doesn't matter, and all of them work the same, even though one might result in positive and the other in negative integers being passed for characters above and including \200 (128).
The reason why you must use int to store the return value of both getchar and putchar is that when the end-of-file condition is reached (or an I/O error occurs), both of them return the value of the macro EOF, which is a negative integer constant (usually -1).
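Since EOF stands for both end-of-file and an error, the standard ferror function can tell the two apart once the loop has ended. A small sketch, where drain is a made-up helper name:

```c
#include <stdio.h>

/* Hypothetical helper: reads `in` to the end and discards the data.
 * Returns 0 if reading stopped at end-of-file, -1 if it stopped on a read error. */
int drain(FILE *in)
{
    int c;

    while ((c = getc(in)) != EOF) {
        /* a real program would process c here */
    }
    return ferror(in) ? -1 : 0;
}
```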
For getchar, if the return value is not EOF, it is the read unsigned char zero-extended to an int. That is, assuming 8-bit characters, the values returned can be 0...255 or the value of the macro EOF; again assuming 8-bit char, there is no way to squeeze these 257 distinct values into 256 so that each of them could be identified uniquely.
Now, if you stored it into char instead, the effect would depend on whether the character type is signed or unsigned by default! This varies from compiler to compiler, architecture to architecture. If char is signed and assuming EOF is defined as -1, then both EOF and character '\377' on input would compare equal to EOF; they'd be sign-extended to (int)-1.
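The signed case can be reproduced without depending on the platform's default by using signed char explicitly. This sketch assumes 8-bit characters, EOF equal to -1, and the usual wrap-around when 255 is converted to signed char (that conversion is implementation-defined). Both function names are invented for the example.

```c
#include <stdio.h>

/* Counts characters the way the first program reads them: through a signed char. */
long count_via_signed_char(FILE *in)
{
    long n = 0;
    signed char c;  /* stands in for plain char on signed-char platforms */

    while ((c = (signed char)getc(in)) != EOF)
        n++;        /* stops early if a byte narrows to the same value as EOF */
    return n;
}

/* Counts characters the recommended way: through an int. */
long count_via_int(FILE *in)
{
    long n = 0;
    int c;

    while ((c = getc(in)) != EOF)
        n++;
    return n;
}
```

On a stream holding the five bytes "ab\377cd", the first function should stop after two characters under those assumptions, while the second counts all five.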
On the other hand, if char is unsigned (as it is by default on ARM processors, including Raspberry Pi systems; and seems to be true for AIX too), there is no value that could be stored in c that would compare equal to -1, including EOF; instead of breaking out on EOF, your code would output a \377 character for each EOF and carry on.
The danger here is that with signed chars the code seems to be working correctly even though it is still horribly broken: one of the legal input values is interpreted as EOF. Furthermore, C89, C99 and C11 do not mandate a value for EOF; they only say that EOF is a negative integer constant. Thus instead of -1 it could as well be, say, -224 on a particular implementation. No 8-bit char can hold -224, so with the usual modulo-256 conversion EOF would be stored as 32 and behave like a space instead of ending the loop.
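Whether a given EOF value can still be recognised after being stored in a character-typed variable can be checked directly. In this sketch the two function names are invented for illustration, 8-bit characters are assumed, and the out-of-range conversion to signed char is implementation-defined (most compilers wrap modulo 256).

```c
#include <stdio.h>  /* for EOF, which callers may pass in */

/* Returns 1 if `eof_value` still compares equal to itself after being
 * stored in a signed char, i.e. whether a signed `char c` loop could see it. */
int survives_signed_char(int eof_value)
{
    signed char c = (signed char)eof_value;     /* implementation-defined if out of range */
    return c == eof_value;
}

/* Same check for unsigned char; the conversion is well-defined (modulo UCHAR_MAX + 1). */
int survives_unsigned_char(int eof_value)
{
    unsigned char c = (unsigned char)eof_value;
    return c == eof_value;                      /* c promotes to a non-negative int */
}
```

A value of -1 survives the signed case only; a hypothetical -224 survives neither, and no negative value survives the unsigned case.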
gcc has the switch -funsigned-char, which can be used to make char unsigned on those platforms where it defaults to signed, so the same program can be built both ways.
Run with signed char, it seems to be working right. But with unsigned char, I tried to press Ctrl-D many times, and a � was printed for each EOF instead of breaking the loop.
Now, again, for the signed char case, it cannot distinguish between char 255 and EOF on Linux, breaking it for binary data and such: given input containing a \0377 escape, only the first part up to that escape was written to stdout.
Beware that comparisons between character constants and an int containing the unsigned character value might not work as expected (e.g. the character constant 'ä' in ISO 8859-1 would mean the signed value -28). So assuming that you write code that would read input until 'ä' in the ISO 8859-1 codepage, you'd compare the input against (unsigned char)'ä' rather than against 'ä' itself.
Due to integer promotion, all char values fit into an int, and are automatically promoted on function calls, thus you can give any of int, char, signed char or unsigned char to putchar as an argument (not to store its return value), and it would work as expected.
The actual value passed in the integer might be positive or even negative; for example the character constant '\377' would be negative on an 8-bit-char system where char is signed; however putchar (or fputc actually) will cast the value to an unsigned char. C11 7.21.7.3p2: "The fputc function writes the character specified by c (converted to an unsigned char) to the output stream pointed to by stream [...]"
I.e. the fputc will be guaranteed to convert the given c as if by (unsigned char)c.
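Both directions of that conversion can be sketched in a few lines, assuming 8-bit characters. The helper names put_signed_value and next_char_is are hypothetical. The value -28 stands for 'ä' in ISO 8859-1 on a signed-char platform and is written as a number to keep the example independent of the source file's encoding.

```c
#include <stdio.h>

/* Writes one negative character value to `out` and returns what fputc reports. */
int put_signed_value(FILE *out)
{
    signed char c = -28;    /* 0xE4 viewed as a signed 8-bit char */
    return fputc(c, out);   /* promoted to int -28, then converted to unsigned char */
}

/* Returns 1 if the next character read from `in` is the byte `wanted`. */
int next_char_is(FILE *in, char wanted)
{
    int c = getc(in);       /* 0..UCHAR_MAX or EOF */
    return c != EOF && c == (unsigned char)wanted;
}
```

Under the 8-bit assumption fputc should report 228 here, and the comparison succeeds because both sides are in the unsigned char range.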
Always use int to save the character from getchar(), as the EOF constant is of int type. If you use char, then the comparison against EOF is not correct.
You can safely pass char to putchar() though, as it will be promoted to int automatically.
Note: Technically, using char will work in most cases, but then you can't have the 0xFF character, as it will be interpreted as EOF due to type conversion. To cover all cases, always use int. As @Ilja put it, int is needed to represent all 256 possible character values and the EOF, which is 257 possible values in total, which cannot be stored in the char type.
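The count of possible return values can be checked with a round trip through a temporary file. This is only a sketch: round_trip_all_bytes is an invented name, and it relies on the standard tmpfile function succeeding in the current environment.

```c
#include <limits.h>
#include <stdio.h>

/* Writes every unsigned char value once to a temporary file, reads the file
 * back through an int, and returns how many characters came back with the
 * value that was written. Returns -1 if the temporary file cannot be created. */
long round_trip_all_bytes(void)
{
    FILE *f = tmpfile();
    long matched = 0;
    int value;
    int c;

    if (f == NULL)
        return -1;
    for (value = 0; value <= UCHAR_MAX; value++)
        fputc(value, f);
    rewind(f);
    for (value = 0; (c = getc(f)) != EOF; value++) {
        if (c == value)
            matched++;
    }
    fclose(f);
    return matched;
}
```

With an int variable every one of the UCHAR_MAX + 1 character values is read back before EOF ends the loop, which is exactly what a char variable cannot guarantee.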