Is sscanf considered safe to use?

2019-01-23 14:52发布

问题:

I have vague memories of suggestions that sscanf was bad. I know it won't overflow buffers if I use the field width specifier, so is my memory just playing tricks with me?

回答1:

I think it depends on how you're using it: If you're scanning for something like int, it's fine. If you're scanning for a string, it's not (unless there was a width field I'm forgetting?).


Edit:

It's not always safe for scanning strings.

If your buffer size is a constant, then you can certainly specify it as something like %20s. But if it's not a constant, you need to specify it in the format string, and you'd need to do:

char format[80]; //Make sure this is big enough... kinda painful
sprintf(format, "%%%ds", cchBuffer - 1); //Don't miss the percent signs and - 1!
sscanf(format, input); //Good luck

which is possible but very easy to get wrong, like I did in my previous edit (forgot to take care of the null-terminator). You might even overflow the format string buffer.



回答2:

The reason why sscanf might be considered bad is because it doesnt require you to specify maximum string width for string arguments, which could result in overflows if the input read from the source string is longer. so the precise answer is: it is safe if you specify widths properly in the format string otherwise not.



回答3:

Note that as long as your buffers are at least as long as strlen(input_string)+1, there is no way the %s or %[ specifiers can overflow. You can also use field widths in the specifiers if you want to enforce stricter limits, or you can use %*s and %*[ to suppress assignment and instead use %n before and after to get the offsets in the original string, and then use those to read the resulting sub-string in-place from the input string.



回答4:

Yes it is..if you specify the string width so the are no buffer overflow related problems.

Anyway, like @Mehrdad showed us, there will be possible problems if the buffer size isn't established at compile-time. I suppose that put a limit to the length of a string that can be supplied to sscanf, could eliminate the problem.



回答5:

There is 2 point to take care.

The output buffer[s].

As mention by others if you specify a size smaller or equals to the output buffer size in the format string you are safe.

The input buffer.

Here you need to make sure that it is a null terminate string or that you will not read more than the input buffer size.

If the input string is not null terminated sscanf may read past the boundary of the buffer and crash if the memorie is not allocated.



回答6:

All of the scanf functions have fundamental design flaws, only some of which could be fixed. They should not be used in production code.

  • Numeric conversion has full-on demons-fly-out-of-your-nose undefined behavior if a value overflows the representable range of the variable you're storing the value in. I am not making this up. The C library is allowed to crash your program just because somebody typed too many input digits. Even if it doesn't crash, it's not obliged to do anything sensible. There is no workaround.

  • As pointed out in several other answers, %s is just as dangerous as the infamous gets. It's possible to avoid this by using either the 'm' modifier, or a field width, but you have to remember to do that for every single text field you want to convert, and you have to wire the field widths into the format string -- you can't pass sizeof(buff) as an argument.

  • If the input does not exactly match the format string, sscanf doesn't tell you how many characters into the input buffer it got before it gave up. This means the only practical error-recovery policy is to discard the entire input buffer. This can be OK if you are processing a file that's a simple linear array of records of some sort (e.g. with a CSV file, "skip the malformed line and go on to the next one" is a sensible error recovery policy), but if the input has any more structure than that, you're hosed.

In C, parse jobs that aren't complicated enough to justify using lex and yacc are generally best done either with POSIX regexps (regex.h) or with hand-rolled string parsing. The strto* numeric conversion functions do have well-specified and useful behavior on overflow and do tell you how may characters of input they consumed, and string.h has lots of handy functions for hand-rolled parsers (strchr, strcspn, strsep, etc).