I only just noticed this 'bug' of scanf, after 8 years with C.
The scanf code below skips the leading whitespace characters on the second line of input.
int x;
char in[100];
scanf("%d\n",&x);
gets(in);
Input:
1
 s
x will contain 1, but in will be just "s", not " s".
Is this standard C or just gcc behaviour?
A whitespace character in your scanf format string causes scanf to consume any (and all) whitespace until a non-whitespace character occurs.
This is standard scanf behaviour and is not limited to gcc.
It's not a bug in scanf. The scanf manual says:
A sequence of white-space characters (space, tab, newline, etc.; see isspace(3)). This directive matches any amount of white space, including none, in the input.
This means that with the format %d\n, scanf reads a number and then consumes a sequence of whitespace characters in the input, returning only once you type a non-whitespace character. That is how you end up seeing only "s" without a space before it.
The '\n' in scanf("%d\n", &x); (and this is true for any whitespace character in the format string) matches any number of whitespace characters in the input (characters for which the isspace function returns a nonzero value, i.e. true, such as newline, space and tab), not just the newline character '\n'. This means that scanf will read and discard all whitespace characters in the input until it encounters a non-whitespace character. This explains the behaviour you observed.
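A minimal sketch of this effect, using sscanf on a string so it can be checked without typing input. The helper name offset_after_number is made up for illustration; the %n conversion records how many characters the format has consumed so far.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not from the original post): parses an int from
 * `input` with the format "%d\n" and returns the offset of the first
 * character the format did NOT consume, or -1 if no int was parsed.
 */
int offset_after_number(const char *input, int *value)
{
    int consumed = -1;

    /* %n stores the number of characters consumed so far; it is not
       counted in the return value of sscanf. */
    if (sscanf(input, "%d\n%n", value, &consumed) != 1)
        return -1;
    return consumed;
}

/*
 * With the input "1\n   s\n" the '\n' directive skips the newline AND
 * the three spaces that follow, so the returned offset is 5, which is
 * the index of 's'.
 */
```

The same call with the format "%d%n" would stop right after the digit, leaving the newline and the spaces unread.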
This is part of the standard definition of the scanf function and not a gcc feature. Also, the gets function is unsafe: it was deprecated and then removed from the language in C11. It does not check for buffer overrun, which can lead to bugs and even program crashes. In fact, gcc emits a warning against the use of gets on my machine. Using fgets instead is recommended.
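As a rough sketch of what replacing gets with fgets can look like: unlike gets, fgets keeps the trailing newline, so the helper below strips it. The name read_line and its return convention are assumptions made for this example, not a standard function.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: reads one line from `fp` into `buf` (capacity
 * `size`) and strips the trailing newline, if any. Returns 1 on success
 * and 0 on end-of-file or read error.
 */
int read_line(FILE *fp, char *buf, size_t size)
{
    if (size == 0 || fgets(buf, (int)size, fp) == NULL)
        return 0;

    /* strcspn gives the index of the first '\n', or the string length
       if there is none, so this is safe in both cases. */
    buf[strcspn(buf, "\n")] = '\0';
    return 1;
}
```

A call such as read_line(stdin, in, sizeof in) would then take the place of gets(in), with the buffer size passed explicitly.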
To do what you want, you can do the following:
int x;
char in[100];
scanf("%d", &x);
After scanf returns successfully, the input stream can still contain any sequence of characters terminated by a newline, depending on what the user typed. Get rid of those extraneous characters before reading a string from stdin.
int ch; // int, not char, so that EOF can be told apart from a valid character
while((ch = getchar()) != '\n' && ch != EOF); // null statement
fgets(in, 100, stdin);
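The steps above can be wrapped into one function that also checks the return values. This is only a sketch: the name read_int_then_line is hypothetical, and it takes a FILE * (rather than hard-coding stdin) purely so that it is easy to try out.

```c
#include <stdio.h>

/*
 * Hypothetical helper: reads an int, discards the rest of that line,
 * then reads the next line into `line` (newline included, leading
 * whitespace preserved). Returns 1 on success, 0 on any failure.
 */
int read_int_then_line(FILE *fp, int *value, char *line, size_t size)
{
    int ch;   /* int, not char, so that EOF is distinguishable */

    if (fscanf(fp, "%d", value) != 1)
        return 0;

    /* Skip everything up to and including the end of the number's line. */
    while ((ch = getc(fp)) != '\n' && ch != EOF)
        ;     /* null statement */

    return fgets(line, (int)size, fp) != NULL;
}
```

Called as read_int_then_line(stdin, &x, in, sizeof in) with the input from the question, x would be 1 and in would hold " s\n", leading space included.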
The above fgets call reads at most 100-1 = 99 characters from the stream pointed to by stdin and stores them in the buffer pointed to by in (it reserves one character for the terminating null byte, which it appends to the buffer before returning). fgets stops when it encounters EOF, reads a '\n', or has already read 100-1 characters, whichever of the three conditions occurs first. If it reads a newline, it stores it in the buffer.
If the user enters 99 or more characters on that line, fgets fills the buffer before reaching the newline, and the extraneous characters are left in the input buffer, where they can interfere with subsequent input operations by scanf, fgets, getchar etc. You can detect this by checking whether the string in contains a newline (strchr is declared in string.h).
if(strchr(in, '\n') == NULL) {
// no newline was stored: extraneous chars may be lying around in the input buffer
// read and discard them
int ch;
while((ch = getchar()) != '\n' && ch != EOF); // null statement
}
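One possible way to package the read-and-drain logic is sketched below. The function name read_line_discard_rest and its return codes are invented for this example; it again takes a FILE * so it can be exercised without interactive input.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: reads one line into `buf` and, if the line did
 * not fit, discards the unread remainder so that the next read starts
 * on a fresh line.
 * Returns 1 if the whole line's text fit, 0 if characters had to be
 * discarded, and -1 on end-of-file or read error.
 */
int read_line_discard_rest(FILE *fp, char *buf, size_t size)
{
    int ch;
    int discarded = 0;

    if (size == 0 || fgets(buf, (int)size, fp) == NULL)
        return -1;

    if (strchr(buf, '\n') != NULL)
        return 1;             /* newline stored: nothing left over */

    /* No newline in the buffer: drain the rest of the line, noting
       whether any real characters were thrown away. */
    while ((ch = getc(fp)) != '\n' && ch != EOF)
        discarded = 1;

    return discarded ? 0 : 1;
}
```

With a 5-byte buffer and the line "abcdefgh", the buffer would end up holding "abcd", the rest of the line would be thrown away, and the function would return 0.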