How validate user input when the expected value is

Posted 2019-04-13 07:08

Question:

I have the following code:

#include <stdio.h>

#define MIN 0
#define MAX 9 

int main()
{
    int n;

    while (1) {
        printf("Enter a number (%d-%d) :", MIN, MAX);
        scanf("%d", &n);

        if (n >= MIN && n <= MAX) {
            printf("Good\n");
        } else {
            printf("Damn you!\n");
            break;
        }
    }

    return 0;
}

The above code works as expected as long as the user inputs an integer value. For example,

$ ./a.out 
Enter a number (0-9) :15
Damn you!
$ ./a.out 
Enter a number (0-9) :5
Good
Enter a number (0-9) :3
Good
Enter a number (0-9) :-1
Damn you!
$ ./a.out 

But when the user enters any unexpected input (like <up-arrow>, which is ^[[A, or any string like abc or abc def, etc.), it fails and goes into an infinite loop.

$ ./a.out 
Enter a number (0-9) :2
Good
Enter a number (0-9) :^[[A
Good
Enter a number (0-9) :Good
Enter a number (0-9) :Good
Enter a number (0-9) :Good
Enter a number (0-9) :Good
Enter a number (0-9) :Good
Enter a number (0-9) :Good
^C

One thing to note: when the user enters <up-arrow> as the very first input, it works as expected! For example,

$ ./a.out 
Enter a number (0-9) :^[[A
Damn you!
$ 

Why this odd behavior? How should we handle the case where the user enters something inappropriate?

Answer 1:

Personally, I advise ditching scanf altogether for interactive user input, especially for numeric input. It just isn't robust enough to handle certain bad cases.

The %d conversion specifier tells scanf to read up to the next non-numeric character (ignoring any leading whitespace). Assume the call

scanf("%d", &val);

If your input stream looks like {'\n', '\t', ' ', '1', '2', '3', '\n'}, scanf will skip over the leading whitespace characters, read and convert "123", and stop at the trailing newline character. The value 123 will be assigned to val, and scanf will return a value of 1, indicating the number of successful assignments.

If your input stream looks like {'a', 'b', 'c', '\n'}, scanf will stop reading at the a, not assign anything to val, and return 0 (indicating no successful assignments).

So far, so good, right? Well, here's an ugly case: suppose your user types in "12w4". You'd probably like to reject this entire input as invalid. Unfortunately, scanf will happily convert and assign the "12" and leave the "w4" in the input stream, fouling up the next read. It will return a 1, indicating a successful assignment.
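To make the "12w4" case concrete, here is a minimal sketch. It uses sscanf, which applies the same %d rules to a string, and the %n directive to report how many characters were consumed. The helper name scan_prefix is made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: converts a leading integer from text with sscanf
 * (same "%d" rules as scanf) and reports how many characters were consumed.
 * Returns sscanf's result: 1 on success, 0 on a matching failure,
 * EOF if the text ends before any conversion. */
int scan_prefix(const char *text, int *value, int *consumed)
{
    *consumed = 0;  /* stays 0 if the conversion fails before reaching %n */
    return sscanf(text, "%d%n", value, consumed);
}
```

For "12w4" the call returns 1, stores 12, and reports 2 characters consumed, so "w4" is what the next read would see. For "abc" it returns 0 and leaves *value untouched.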

Here's another ugly case: suppose your user types in an obnoxiously long number, like "1234567890123456789012345678901234567890". Again, you'd probably like to reject this input outright, but scanf will go ahead and attempt the conversion regardless of whether the target data type can represent that value. If it can't, the behavior is undefined, so you have no reliable way to detect the overflow.

To properly handle those cases, you need to use a different tool. A better option is to read the input as text using fgets (protecting against buffer overflows), and manually convert the string using strtol. Advantages: you can detect and reject bad strings like "12w4", you can reject inputs that are obviously too long and out of range, and you don't leave any garbage in the input stream. Disadvantages: it's a bit more work.

Here's an example:

#include <ctype.h>   // isspace
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
...
#define DIGITS ... // maximum number of digits for your target data type;
                   // for example, a signed 16-bit integer has up to 5 digits.
#define INPUT_SIZE ((DIGITS)+3) // Account for sign character, newline, and 0 terminator;
                                // not called BUFSIZ, which <stdio.h> already defines
...
char input[INPUT_SIZE];

if (!fgets(input, sizeof input, stdin))
{
  // read error on input - panic
  exit(-1);
}
else
{
  /**
   * Make sure the user didn't enter a string longer than the buffer
   * was sized to hold by looking for a newline character.  If a newline 
   * isn't present, we reject the input and read from the stream until
   * we see a newline or get an error.
   */
  if (!strchr(input, '\n'))
  {
    // input too long
    while (fgets(input, sizeof input, stdin) && !strchr(input, '\n'))
    ;
  }
  else
  {
    char *chk;
    int tmp = (int) strtol(input, &chk, 10);

    /**
     * chk points to the first character not converted.  If
     * it's whitespace or 0, then the input string was a valid
     * integer
     */
    if (isspace((unsigned char) *chk) || *chk == 0)
      val = tmp;
    else
      printf("%s is not a valid integer input\n", input);
  }
}
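
A self-contained variant of the same idea, as a rough sketch rather than a drop-in replacement: it also checks errno and the int range after strtol, which the fragment above leaves out. The helper names parse_int_line and read_int and the 64-byte buffer are my own assumptions.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parses one complete line of text as an int.
 * Returns 1 and stores the value on success; returns 0 if the line holds
 * no digits, has trailing junk, or does not fit in an int. */
int parse_int_line(const char *line, int *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(line, &end, 10);
    if (end == line)                      /* no digits at all, e.g. "abc" */
        return 0;
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return 0;                         /* value does not fit in an int */
    while (isspace((unsigned char)*end))  /* allow trailing whitespace/newline */
        end++;
    if (*end != '\0')                     /* trailing junk, e.g. "12w4" */
        return 0;
    *out = (int)v;
    return 1;
}

/* Hypothetical helper: reads one line from fp and parses it.
 * Returns 1 on success, 0 on invalid input, EOF on end of input or error. */
int read_int(FILE *fp, int *out)
{
    char buf[64];   /* assumed generous enough for any int plus whitespace */

    if (!fgets(buf, sizeof buf, fp))
        return EOF;
    if (!strchr(buf, '\n') && !feof(fp)) {
        /* line longer than the buffer: discard the rest and reject it */
        int c;
        while ((c = getc(fp)) != '\n' && c != EOF)
            ;
        return 0;
    }
    return parse_int_line(buf, out);
}
```

In the original program you would call read_int(stdin, &n) inside the loop, re-prompt on 0, and stop on EOF. Keeping the parsing separate from the reading makes the validation easy to test on plain strings.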


Answer 2:

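My advice would be to check the return value of scanf(). If it is zero, there has been a matching failure (i.e. the user didn't input an integer).

The check appears to succeed because scanf() does not alter n when the match fails, so the comparison runs on whatever n already held: the previous value (2 in your second run, hence "Good"), or an uninitialised value on the very first read. The rejected characters also stay in the input stream, so every later scanf() call fails on them again, which is what produces the infinite loop. My advice there would be to always initialise everything so that you don't end up with weird logic results like these.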

For example:

if (scanf("%d",&n) != 1)
{
  fprintf(stderr,"Input not recognised as an integer, please try again.\n");
  // anything else you want to do when it fails goes here
}
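
If you want to re-prompt instead of giving up, something along these lines might work. It is only a sketch: the helper name read_in_range and its FILE * parameter are my own choices (pass stdin for interactive use). The important part is discarding the rest of the line after a failure, so the same characters are not read again.

```c
#include <stdio.h>

/* Hypothetical helper: keeps reading from `in` until it gets an integer in
 * [min, max]. Returns 1 and stores the value on success, or 0 if the input
 * ends (or a read error occurs) first. */
int read_in_range(FILE *in, int min, int max, int *out)
{
    for (;;) {
        int n = 0;   /* initialised so a failed match never leaves it indeterminate */
        int rc = fscanf(in, "%d", &n);

        if (rc == EOF)
            return 0;                   /* nothing more to read */
        if (rc == 1 && n >= min && n <= max) {
            *out = n;
            return 1;
        }
        /* Matching failure or out-of-range value: discard the rest of the
         * line so the same characters are not read again. */
        int c;
        while ((c = getc(in)) != '\n' && c != EOF)
            ;
    }
}
```

With the input lines "abc", "15" and "7", a call with the range 0-9 skips the first two and returns 7. Note that it still accepts "7xyz" as 7, since %d stops at the first non-digit.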


Answer 3:

I would use a char buffer to get the input and then convert it to an integer with e.g. atoi. Only problem here is that atoi returns 0 on failure (you can't determine if it's 0 because of failure or because the value is 0).

Also you could just compare the strings with strncmp.
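
One way around the atoi ambiguity, sketched here with a made-up helper name, is strtol: its end pointer tells you whether any digits were converted at all, so "0" and "abc" no longer look the same.

```c
#include <stdlib.h>

/* Hypothetical helper: returns 1 if text starts with a decimal integer
 * (stored in *out as a long), 0 if no digits could be converted. */
int starts_with_number(const char *text, long *out)
{
    char *end;
    long v = strtol(text, &end, 10);

    if (end == text)   /* strtol consumed nothing: not a number */
        return 0;
    *out = v;
    return 1;
}
```

Both atoi("0") and atoi("abc") return 0, whereas this helper returns 1 for "0" and 0 for "abc". It deliberately ignores anything after the number; checking *end would catch trailing junk as well.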

// edit:

As suggested in the comments, you can do the check with isdigit(). Since I'm in a bit of a hurry, I couldn't adapt my example to your use case, but I doubt that will cause any trouble.

Some example code would be:

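#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>


int main(void)
{
    int x;
    char buf[4] = {0};
    // %3s reads at most 3 characters plus the terminator, so buf cannot overflow
    if(scanf("%3s",buf) == 1 && isdigit((unsigned char)buf[0]))
    {
        x = atoi(buf);
        if( x > 9)
        {
           // Not okay
        }
        else
        {
          // okay
        }
    }
    else
    {
    // Not okay (no input, or the first character is not a digit)
    }
    return 0;
}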

If the first element of the buffer is not a digit, you know it's invalid input anyway.

Otherwise, you convert the value with atoi and check whether it is greater than 9. (You don't need to check the lower bound, since -1 is already rejected by the isdigit call: buf[0] would be '-'.)



Answer 4:

I have updated the code as follows (checked scanf() return value) and it works fine.

#include <stdio.h>
#include <errno.h>

#define MIN 0
#define MAX 9 

int main()
{
    int n;

    while (1) {
        errno = 0;
        printf("Enter a number (%d-%d) :", MIN, MAX);

        if (scanf("%d", &n) != 1) {
            printf("Damn you!\n");
            break;
        } 

        if (n >= MIN && n <= MAX) {
            printf("Good\n");
        } else {
            printf("Damn you!\n");
            break;
        }
    }

    return 0;
}

The following are a few things to note from the scanf() man page:

man scanf

The format string consists of a sequence of directives which describe how to process the sequence of input characters. If processing of a directive fails, no further input is read, and scanf() returns. A "failure" can be either of the following: input failure, meaning that input characters were unavailable, or matching failure, meaning that the input was inappropriate.

RETURN VALUE: scanf returns the number of input items successfully matched and assigned, which can be fewer than provided for, or even zero in the event of an early matching failure. The value EOF is returned if the end of input is reached before either the first successful conversion or a matching failure occurs. EOF is also returned if a read error occurs, in which case the error indicator for the stream is set, and errno is set to indicate the error.
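
The three outcomes described in that excerpt can be told apart like this. It is a small sketch using sscanf, so the input can be supplied as a string (the end of the string plays the role of end-of-file); the helper name classify is made up.

```c
#include <stdio.h>

/* Hypothetical helper: names the three outcomes of a single "%d" conversion.
 * sscanf is used so the input can be given as a string; reaching the end of
 * the string is treated like reaching end-of-file. */
const char *classify(const char *text)
{
    int n;
    int rc = sscanf(text, "%d", &n);

    if (rc == EOF)
        return "input failure";     /* ran out of input before any conversion */
    if (rc == 0)
        return "matching failure";  /* input present, but not an integer */
    return "ok";                    /* one item matched and assigned */
}
```

An empty string gives "input failure", "abc" gives "matching failure", and "7" gives "ok". The program above treats the first two the same way by testing != 1.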



Answer 5:

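scanf returns the number of fields it successfully read, so you can do something like

if (scanf("%d",&n)<1) exit(1);

or even retry until a number is read. The rejected characters stay in the input stream, so the loop has to discard them (and stop at EOF); otherwise it re-reads the same bad input forever:

int rc;
while ((rc = scanf("%d",&n)) != 1 && rc != EOF) scanf("%*[^\n]");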