I have snprintf, and it can avoid a buffer overflow, but why is there no function called snscanf?
Code:
    #include <stdio.h>

    int main(void)
    {
        const char *src = "helloeveryone";
        char buf1[5];
        sscanf(src, "%s", buf1); // writes past the end of buf1 (array out of bounds)
        return 0;
    }
So I think an snscanf is also needed. Why do we only have snprintf?
A few more wrinkles. The 'n' usually refers to the size of the first argument, as in snprintf. Now, it is true that the first string argument of sscanf is not written to. However, it is read. Thus, scanning an integer out of a 2-character array s that has no terminating '\0' could segfault, because stepping one char beyond s could inadvertently read from undefined memory (or continue the integer with digits from another variable). So an snscanf that takes the length of s would be useful.
s is not a string, of course, but a character array. The 'n' in snscanf would prevent overstepping (reading from) the first (source string) argument, and would not be related to the destination argument.
The way to avoid this today is to first make sure that s is terminated by a '\0' within 2 characters. You can't use strlen, of course. You need strnlen, and a test of whether its result is less than 2. If it is 2, then more copying effort is needed first.
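A minimal sketch of that "copying effort", under some assumptions: scan_int_n is a hypothetical helper name (not a standard function), the scratch buffer size of 32 is arbitrary, and memchr is used in place of strnlen only because memchr is part of ISO C while strnlen comes from POSIX. The idea is to look at no more than n bytes of s, copy them into a terminated buffer, and only then call sscanf.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not a standard function): parse an int from the
 * first n bytes of s without ever reading s[n] or beyond.
 * Returns 1 on success, 0 on failure. */
int scan_int_n(const char *s, size_t n, int *out)
{
    char tmp[32];                           /* assumed big enough for an int */
    const char *nul = memchr(s, '\0', n);   /* like strnlen: inspects at most n bytes */
    size_t len = nul ? (size_t)(nul - s) : n;

    if (len >= sizeof tmp)
        return 0;                           /* too long for the scratch buffer */
    memcpy(tmp, s, len);                    /* the extra copy */
    tmp[len] = '\0';                        /* now it is a proper string */
    return sscanf(tmp, "%d", out) == 1;
}
```

With char s[2] = { '4', '2' }, a call such as scan_int_n(s, sizeof s, &x) never touches memory past s[1].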
Why don't you try fgets() (with the standard input file stdin)?
fgets() lets you specify the maximum size for your buffer, so you can declare an array char buffer[MAXBUFF + 1] and call fgets(buffer, MAXBUFF + 1, stdin).
fgets() then reads at most MAXBUFF characters from stdin, which is the standard input (that means: the keyboard). The result is held in the array buffer. If a '\n' character is found, the reading stops and the '\n' is also stored in buffer (as the last character). In addition, a '\0' is always added at the end of buffer, so enough storage is needed.
You can use a combination of fgets() followed by sscanf() on buffer in order to process the string. Thus, you have a "safe" scanf()-like method.
Note: This approach has a potential problem. If fgets() reaches MAXBUFF characters before the end-of-line character '\n' is obtained, the rest of the input will not be discarded, and it will be taken as part of the next keyboard reading. Hence, one has to add a flush mechanism, which is actually very simple: keep calling getchar() until it returns '\n'.
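The two steps described above might look roughly like this. This is only a sketch: read_int_line and discard_line are hypothetical names, MAXBUFF is given an arbitrary value, and the helpers take a FILE * parameter (pass stdin for keyboard input) so they can be tried on any stream. The flush loop also stops at EOF, which a bare getchar() != '\n' test would not.

```c
#include <stdio.h>

#define MAXBUFF 8   /* assumed limit, chosen only for illustration */

/* Hypothetical helper: throw away the rest of the current input line. */
void discard_line(FILE *in)
{
    int c;
    while ((c = getc(in)) != '\n' && c != EOF)
        ;
}

/* Hypothetical helper: read one line with fgets(), then parse an int from it
 * with sscanf(). Returns 1 if an int was parsed, 0 otherwise. */
int read_int_line(FILE *in, int *out)
{
    char buffer[MAXBUFF + 1];               /* MAXBUFF characters plus '\0' */

    if (fgets(buffer, sizeof buffer, in) == NULL)
        return 0;                           /* end of input or read error */
    return sscanf(buffer, "%d", out) == 1;  /* buffer is bounded and terminated */
}
```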
However: if you just add that flush loop after the fgets() line, the user will be forced to press ENTER twice each time (s)he enters fewer than MAXBUFF characters. Worse: this is the most typical situation!
To fix this new problem, observe that an easy logical condition completely equivalent to the fact that the character '\n' was not reached is the following:
(buffer[MAXBUFF - 1] != '\0') && (buffer[MAXBUFF - 1] != '\n')
(Prove it!)
Thus, we run the flush loop only when this condition is true.
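As a sketch of the conditional flush, assuming a hypothetical helper named read_line_truncating and an arbitrary MAXBUFF: the position buffer[MAXBUFF - 1] is cleared before fgets() runs, because the test is only meaningful if that position did not already hold garbage.

```c
#include <stdio.h>

#define MAXBUFF 8   /* assumed limit, chosen only for illustration */

/* Hypothetical helper: read one line into buffer and drop whatever did not
 * fit. Returns 1 if the buffer filled up before '\n' was seen, 0 if the
 * whole line fit, -1 on end of input or read error. */
int read_line_truncating(FILE *in, char buffer[MAXBUFF + 1])
{
    int c;

    buffer[MAXBUFF - 1] = '\0';             /* sentinel, set before fgets() */
    if (fgets(buffer, MAXBUFF + 1, in) == NULL)
        return -1;

    /* The condition from the text: the last slot holds a real character
     * that is not '\n', so the end of the line was not reached. */
    if (buffer[MAXBUFF - 1] != '\0' && buffer[MAXBUFF - 1] != '\n') {
        while ((c = getc(in)) != '\n' && c != EOF)
            ;                               /* flush the leftover input */
        return 1;
    }
    return 0;
}
```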
A final touch is needed: since the array buffer could contain garbage, it seems that some kind of initialization is needed. However, let us observe that only the position [MAXBUFF - 1] has to be cleared:
char buffer[MAXBUFF + 1] = { [MAXBUFF - 1] = '\0' }; /* ISO C99 syntax */
Finally, we can gather all these facts in a quick macro, safe_scanf().
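One way such a macro could be written is sketched here; it is a reconstruction under assumptions, not necessarily the exact original. The stream parameter of safe_fscanf, the argument order, and the variable name c_ are additions of this sketch, and maxb has to be an integer constant expression because an initialized array cannot be variable-length.

```c
#include <stdio.h>

/* Sketch only: read at most maxb characters of one line from stream, flush
 * the rest of the line if it did not fit, then hand the text to sscanf().
 * The body is a plain { } block, so the macro is a statement, not an
 * expression. */
#define safe_fscanf(stream, maxb, fmt, ...) \
    { \
        char buffer[(maxb) + 1] = { [(maxb) - 1] = '\0' }; \
        int c_; \
        if (fgets(buffer, (maxb) + 1, (stream)) == NULL) \
            buffer[0] = '\0'; /* nothing read: scan an empty string */ \
        else if (buffer[(maxb) - 1] != '\0' && buffer[(maxb) - 1] != '\n') \
            while ((c_ = getc(stream)) != '\n' && c_ != EOF) \
                ; \
        (void)sscanf(buffer, (fmt), __VA_ARGS__); \
    }

/* The keyboard version simply fixes the stream to stdin. */
#define safe_scanf(maxb, fmt, ...) safe_fscanf(stdin, maxb, fmt, __VA_ARGS__)
```

A call such as safe_scanf(20, "%d", &n); would then read one line of at most 20 characters from the keyboard and parse an int from it.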
This uses the mechanism of a variable number of parameters in a macro, available under the ISO C99 norms: variadic macros. __VA_ARGS__ replaces the variable list of parameters. (We need a variable number of parameters in order to mimic the scanf()-like behaviour.)
Notes: The macro body was enclosed inside a block with { }. This is not completely satisfactory, and it is easily improved, but it is part of another topic... In particular, the macro safe_scanf() does not "return" a value (it is not an expression, but a block statement).
Remark: Inside the macro I have declared an array buffer, which is created at the time of entering the block and then destroyed when the block is exited. The scope of buffer is limited to the block of the macro.
How to use sscanf correctly and safely
Note that snprintf is not alone, and most array functions have a secure variation.
There's no need for an snscanf() because there's no writing to the first buffer argument. The buffer length in snprintf() specifies the size of the buffer that is written to.
The buffer in the corresponding position for sscanf() is a null-terminated string; there's no need for an explicit length as you aren't going to write to it (it's a const char * restrict buffer in C99 and C11).
For the output arguments, you are already expected to specify the length of the strings in the format (though you're probably in the majority if you use %s rather than %99s or whatever is strictly appropriate).
It would be nice/useful if you could use %*s as you can with snprintf(), but you can't: in sscanf(), the * means 'do not assign scanned value', not the length. Note that you wouldn't write snscanf(src, sizeof(buf1), "%s", buf1), not least because you can have multiple %s conversion specifications in a single call. Writing snscanf(src, sizeof(buf1), sizeof(buf2), "%s %s", buf1, buf2) makes no sense, not least because it leaves an insoluble problem in parsing the varargs list. It would be nice to have a notation such as snscanf(src, "%@s %@s", sizeof(buf1), buf1, sizeof(buf2), buf2) to obviate the need to specify the field size (minus one) in the format string. Unfortunately, you can't do that with sscanf() et al now.
Annex K of ISO/IEC 9899:2011 (previously TR24731) provides sscanf_s(), which does take lengths for character strings: each %s argument is followed by an extra argument giving the size of its buffer.
(Thanks to R.. for reminding me of this theoretical option; theoretical because only Microsoft has implemented the 'safe' functions, and they did not implement them exactly as the standard requires.)
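Since sscanf_s() is rarely available, the portable route remains the field width in the format string mentioned earlier in this answer. A small sketch follows, with hypothetical names (WORD_MAX, STR, STR_, scan_word); the stringizing macros are one common way to keep the width and the array size in sync at compile time.

```c
#include <stdio.h>

#define WORD_MAX 4          /* most characters stored, not counting '\0' */
#define STR_(x) #x
#define STR(x) STR_(x)      /* expands WORD_MAX before turning it into text */

/* Hypothetical helper: copy the first whitespace-delimited word of src into
 * out, storing at most WORD_MAX characters plus the terminating '\0'.
 * Returns 1 if a word was read, 0 otherwise. */
int scan_word(const char *src, char out[WORD_MAX + 1])
{
    /* The format becomes "%4s": the width is one less than the array size. */
    return sscanf(src, "%" STR(WORD_MAX) "s", out) == 1;
}
```

Applied to the question's "helloeveryone" and a five-element buffer, this stores "hell" and the terminator, and nothing more.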
Note that §K.3.3 Common definitions <stddef.h> says: '... The type is rsize_t which is the type size_t.385)' (and footnote 385 says: 'See the description of the RSIZE_MAX macro in <stdint.h>.') That means that in fact you can pass size_t without needing a cast, as long as the value passed is within the range defined by RSIZE_MAX in <stdint.h>. (The general intention is that RSIZE_MAX is a largish number but smaller than SIZE_MAX. For more details, read the 2011 standard, or get TR 24731 from the Open Standards web site.)
The controversial (and optional) Annex K to C11 adds a sscanf_s function which takes an additional argument of type rsize_t (also defined in Annex K) after the pointer argument, specifying the size of the pointed-to array. For better or worse, these functions are not widely supported. You can achieve the same results by putting the size in the conversion specifier, e.g. %4s for the five-element buf1, but this is awkward and error-prone if the size of the destination object may vary at runtime (you would have to construct the conversion specifier programmatically with snprintf). Note that the field width in the conversion specifier is the maximum number of input characters to read, and sscanf also writes a terminating null byte for %s conversions, so the field width you pass must be strictly less than the size of the destination object.
In sscanf(s, format, ...), the array of characters scanned is a const char *. There is no writing to s. The scanning stops when s[i] is NUL. There is little need for an n parameter as an auxiliary limit to the scan.
In sprintf(s, format, ...), the array s is a destination. snprintf(s, n, format, ...) ensures that data is not written to s[n] and beyond.
What would be useful is a flag extension to sscanf() conversion specifiers so that a limit could easily be specified at compile time. (It can be done in a cumbersome fashion today, with a dynamic format or with sscanf(src,"%4s",buf1).) For example, a hypothetical ! flag in the specifier would tell sscanf() to read a size_t argument giving the size limit of the upcoming string. Maybe in C17?
The cumbersome method that works today is to build the format string at run time.
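A sketch of that cumbersome method, assuming a hypothetical helper named scan_word_sized and an arbitrary 32-byte format buffer: snprintf() writes a format such as "%4s" whose field width is one less than the destination size, and sscanf() then uses it.

```c
#include <stdio.h>

/* Hypothetical helper: read one word from src into dst, which has room for
 * dst_size bytes. Returns 1 if a word was stored, 0 otherwise. */
int scan_word_sized(const char *src, char *dst, size_t dst_size)
{
    char format[32];
    int n;

    if (dst_size < 2)
        return 0;   /* need room for at least one character plus '\0' */

    /* Field width is dst_size - 1 because sscanf() appends a '\0'. */
    n = snprintf(format, sizeof format, "%%%zus", dst_size - 1);
    if (n < 0 || (size_t)n >= sizeof format)
        return 0;   /* the format itself did not fit */

    return sscanf(src, format, dst) == 1;
}
```

The cost is an extra formatting step on every call, and the compiler can no longer check the format string against the arguments.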