Unexpected repetition using fgets and sscanf

Posted 2019-07-24 02:26

Question:

Here is part of my code. The aim of fgets and sscanf is to scan three variables separated by exactly one space. If the line passes, the program echoes the instruction. Otherwise, it prints an error and exits.

I want to use a char array of length 7 to limit the number of characters in the line, accepting only input like 'g 3 3'. But something seems to be wrong in my code.

#include <stdio.h> 

int main (void) {
    char line[7];
    char command;
    int x, y, nargs;

    while(1){
        /* problem: g  4 4 or g 4  4 can also pass */
        fgets(line, 7, stdin);
        nargs = sscanf(line, "\n%c %d %d", &command, &x, &y);

        if(nargs != 3){
          printf("error\n");
          return 0;
        }

        printf("%c %d %d\n", command, x, y);
    }
}

Unexpected:

g  4 4
g 4 4
error

Expected:

g 4 4
g 4 4
// I can continue type

Can anyone tell me why it will still repeat the instruction?

Answer 1:

According to the C11 standard, 7.21.6.2p5:

A directive composed of white-space character(s) is executed by reading input up to the first non-white-space character (which remains unread), or until no more characters can be read.

This describes the \n directive and the two space characters as being identical in functionality: They'll match as much consecutive white-space (spaces, tabs, newlines, etc) as they can from the input.

If you want to match a single space (and only a single space), I suggest using %*1[ ] instead of the white-space directives. You could use %*1[\n] to similarly discard a newline. For example, since the newline character appears at the end of a line:

nargs = sscanf(line, "%c%*1[ ]%d%*1[ ]%d%*1[\n]", &command, &x, &y);

This won't completely solve your problem, unfortunately, as the %d format specifier is also defined to discard white-space characters:

Input white-space characters (as specified by the isspace function) are skipped, unless the specification includes a [, c, or n specifier
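Both rules can be checked with a small sketch. The helper names scan_loose and scan_scanset are made up for illustration; each one only returns what sscanf reports for the given line.

```c
#include <stdio.h>

/* Original format: every white-space directive ("\n" and " ")
   matches zero or more white-space characters. */
int scan_loose(const char *line) {
    char command;
    int x, y;
    return sscanf(line, "\n%c %d %d", &command, &x, &y);
}

/* Scanset format: %*1[ ] matches exactly one space, but %d still
   skips any white-space that follows it before reading the number. */
int scan_scanset(const char *line) {
    char command;
    int x, y;
    return sscanf(line, "%c%*1[ ]%d%*1[ ]%d%*1[\n]", &command, &x, &y);
}
```

With these, scan_loose("g  4 4\n") and scan_scanset("g  4 4\n") both return 3, so the doubled space goes unnoticed either way. The scanset version does at least reject a missing space: scan_scanset("g4 4\n") returns 1, while scan_loose("g4 4\n") still returns 3.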

With some clever hacks you might be able to keep using sscanf (or, better yet, scanf without the intermediate buffer). After weighing the alternatives by their maintenance cost, though, you might as well just use getchar. So if you're looking for a solution to your problem rather than an answer to the question you posed, I'd recommend gsamaras's answer.



Answer 2:

What you have there won't work, since sscanf() doesn't care whether the user enters one space or two.

You could approach this in a simple way, by taking advantage of short circuiting and by using getchar(), like this:

#include <stdio.h>
#include <ctype.h>

#define SIZE 100

int main(void) {
    int c, i = 0;
    char line[SIZE] = {0};
    while ((c = getchar()) != EOF) {
        // is the first char an actual character?
        if(i == 0 && !isalpha(c)) {
                printf("error\n");
                return -1;
        // do I have a single space in the 2nd and 4th positions?
        } else if((i == 1 || i == 3) && c != ' ') {
                printf("error\n");
                return -1;
        // do I have digits in 3rd and 5th position?
        } else if((i == 2 || i == 4) && !isdigit(c)) {
                printf("error\n");
                return -1;
        // I expect that the user hits enter after inputting the command
        } else if(i == 5 && c != '\n') {
                printf("error\n");
                return -1;
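        // everything went fine, I am done with the input, print it
        } else if(i == 5) {
                printf("%s\n", line);
                // start the next command without storing the newline,
                // so a stale '\n' in line[5] is not printed next time
                i = 0;
                continue;
        }
        line[i++] = c;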
    }
    return 0;
}

Output:

gsamaras@gsamaras:~$ gcc -Wall px.c
gsamaras@gsamaras:~$ ./a.out 
g 4 4
g 4 4
g  4 4
error
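If you would rather keep reading whole lines with fgets, the same position-by-position check can be applied to the buffer. This is only a sketch: the name is_valid_command is made up, and it assumes the buffer passed to fgets holds at least 8 bytes so that a valid line plus its newline fits.

```c
#include <ctype.h>

/* Returns 1 when line is exactly "<letter> <digit> <digit>\n", 0 otherwise.
   The && chain short-circuits, so nothing past the terminating '\0' is read. */
int is_valid_command(const char *line) {
    return isalpha((unsigned char)line[0])
        && line[1] == ' '
        && isdigit((unsigned char)line[2])
        && line[3] == ' '
        && isdigit((unsigned char)line[4])
        && line[5] == '\n'
        && line[6] == '\0';
}
```

A caller would do something like `if (!is_valid_command(line)) { printf("error\n"); return -1; }` right after a successful fgets.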


Answer 3:

Can anyone tell me why it will still repeat the instruction?

The tricky part is that "%d" consumes leading white-space, so code needs to detect leading white-space first.

" " consumes 0 or more white-space and never fails.

So "\n%c %d %d" does not reliably detect the number of intervening spaces.


If the ints can be more than 1 character long, use this; otherwise see the simplification below.

Use "%n" to detect how far sscanf() has progressed through the buffer.

It gets the job done using sscanf(), which apparently is required.

// No need for a tiny buffer
char line[80];
if (fgets(line, sizeof line, stdin) == NULL) Handle_EOF();

int n[6];
n[5] = 0;
#define SPACE1 "%n%*1[ ] %n"
#define EOL1   "%n%*1[\n] %n"

// Return value not checked as following `if()` is sufficient to detect scan completion.
// See below comments for details
sscanf(line, "%c" SPACE1 "%d" SPACE1 "%d" EOL1, 
  &command, &n[0], &n[1],
  &x,       &n[2], &n[3],
  &y,       &n[4], &n[5]);

// If scan completed to the end with no extra
if (n[5] && line[n[5]] == '\0') {
  // Only 1 character between?
  if ((n[1] - n[0]) == 1 && (n[3] - n[2]) == 1 && (n[5] - n[4]) == 1) {
    Success(command, x, y);
  }
}

Maybe add a test to ensure command is not white-space, but I think that will happen anyway in command processing.
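Put together as one function, the idea might look like the sketch below. The name parse_command and its return convention are assumptions made for illustration; the white-space test on command and the check of the sscanf() return value are included as well.

```c
#include <ctype.h>
#include <stdio.h>

/* Returns 1 and fills *command, *x, *y when line is exactly
   "<char> <int> <int>\n" with single spaces; returns 0 otherwise. */
int parse_command(const char *line, char *command, int *x, int *y) {
    int n[6] = {0};
    /* Each separator is bracketed by two %n: one before the single-space
       scanset, one after any extra white-space has been swallowed. */
    int cnt = sscanf(line, "%c%n%*1[ ] %n%d%n%*1[ ] %n%d%n%*1[\n] %n",
                     command, &n[0], &n[1], x, &n[2], &n[3], y, &n[4], &n[5]);

    /* n[5] stays 0 unless the scan reached the final %n. */
    if (cnt != 3 || n[5] == 0 || line[n[5]] != '\0')
        return 0;
    if (isspace((unsigned char)*command))
        return 0;
    /* Exactly one character consumed by each separator? */
    return n[1] - n[0] == 1 && n[3] - n[2] == 1 && n[5] - n[4] == 1;
}
```

With this sketch, "g 4 4\n" and "g 12 34\n" are accepted, while "g  4 4\n", "g 4  4\n" and a line without the trailing newline are rejected.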


A simplification is possible if the ints must be exactly 1 digit each, by combining @Seb's answer with the above. This works because every field has a fixed length in an acceptable line.

// Scan 1 and only 1 space
#define SPACE1 "%*1[ ]"

int n = 0;
// Return value not checked as following `if()` is sufficient to detect scan completion.
sscanf(line, "%c" SPACE1 "%d" SPACE1 "%d" "%n", &command, &x, &y, &n);

// Adjust this to accept a final \n or not as desired.
if (n == 5 && (line[n] == '\n' || line[n] == '\0')) {
  Success(command, x, y);
}

@Seb and I dove into the need for checking the return value of sscanf(). The cnt == 3 test is redundant, since n == 5 can only be true when the entire line was scanned and sscanf() returned 3. Still, using the saved variables without first qualifying the result of sscanf() is not robust code in general, and because many bugs stem from skipping that qualification, a number of code checkers flag any unchecked sscanf() call. Here the n == 5 test is a simple and sufficient check, so such a flag would be a false positive, but it is easy enough to add the redundant check.

// sscanf(line, "%c" SPACE1 "%d" SPACE1 "%d" "%n", &command, &x, &y, &n);
// if (n == 5 && (line[n] == '\n' || line[n] == '\0')) {
int cnt = sscanf(line, "%c" SPACE1 "%d" SPACE1 "%d" "%n", &command, &x, &y, &n);
if (cnt == 3 && n == 5 && (line[n] == '\n' || line[n] == '\0')) {
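Wrapped up as a function, the single-digit variant might look like this. It is a sketch only; parse_fixed_width is a made-up name, and it accepts the line with or without the final newline, as in the test above.

```c
#include <stdio.h>

/* Returns 1 and fills *command, *x, *y when line is
   "<char> <digit> <digit>", optionally followed by '\n'; 0 otherwise.
   Note: it does not check that *command is a non-white-space character. */
int parse_fixed_width(const char *line, char *command, int *x, int *y) {
    int n = 0;
    int cnt = sscanf(line, "%c%*1[ ]%d%*1[ ]%d%n", command, x, y, &n);

    /* n == 5 only if each int used exactly one character and
       %d had no extra white-space to skip. */
    return cnt == 3 && n == 5 && (line[n] == '\n' || line[n] == '\0');
}
```

Any extra space or extra digit pushes n past 5, so "g  4 4\n" and "g 12 3\n" are both rejected.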


Answer 4:

Do you have a problem with your program? gdb is your best friend =)

gcc -g yourProgram.c
gdb ./a.out
break fgets
run
finish
g 4  4

Then step through the statements; whenever you encounter scanf or printf, just type finish. What you will see is that the program completes this iteration successfully, but then it does not wait for input and just prints the error message. Why? Well, type:

man fgets

fgets reads at most ONE LESS than size characters, so in your case fgets is only allowed to read 6 characters, but you gave it 7! Yes, the newline is a character just like the space. So what happens to the 7th? It stays buffered, which means that instead of reading from the keyboard, your program sees that there are characters in the buffer and uses them (one character in this example).

Edit: Here is what you can do to make your program work: ignore empty lines. If strcmp(line, "\n") == 0, jump to the next iteration; if you are not allowed to use strcmp, a workaround is to compare line[0] == '\n'.
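To see the leftover newline without a debugger, here is a rough sketch that feeds a string through fgets and counts the calls. count_fgets_calls is a hypothetical helper, and it assumes tmpfile() works on your system.

```c
#include <stdio.h>

/* Hypothetical helper: writes input to a temporary file, reads it back
   with fgets(buf, size, ...) and returns the number of successful fgets
   calls, or -1 if the setup fails. */
int count_fgets_calls(const char *input, int size) {
    char buf[64];
    int calls = 0;
    FILE *fp;

    if (size < 2 || size > (int)sizeof buf)
        return -1;
    fp = tmpfile();
    if (fp == NULL)
        return -1;
    fputs(input, fp);
    rewind(fp);
    /* Each call stores at most size - 1 characters plus the '\0'. */
    while (fgets(buf, size, fp) != NULL)
        calls++;
    fclose(fp);
    return calls;
}
```

With size 7, "g 4 4\n" is consumed in one call, but "g 4  4\n" needs two: the first call returns "g 4  4" and the second returns only "\n", which is the line that makes sscanf fail. With a larger buffer such as 80, both lines take a single call.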



Tags: c fgets scanf