In an answer there was an interesting statement: "It's almost always a bad idea to use the fscanf() function as it can leave your file pointer in an unknown location on failure. I prefer to use fgets() to get each line in and then sscanf() that."

Could you expand upon when/why it might be better to use fgets() and sscanf() to read some file?
You can always use ftell() to find out the current position in the file, and then decide what to do from there. Basically, if you know what to expect, feel free to use fscanf().

Basically, there's no way to tell that function not to go out of bounds of the memory area you've allocated for it. A number of replacements have come out, like fnscanf, which attempt to fix those functions by specifying a maximum limit on what the reader may write, so that it cannot overflow.
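Within standard C you can get the same protection with a maximum field width in the conversion specification, which bounds how much a string conversion may write. A small sketch (the helper name is mine):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Parse one whitespace-delimited word into a fixed-size buffer.
 * A plain "%s" has no length limit and can overrun `word`; the
 * field width in "%15s" caps the write at 15 characters plus the
 * terminating '\0', so a 16-byte buffer can never overflow. */
int parse_word(const char *input, char word[16])
{
    return sscanf(input, "%15s", word);
}
```

The width must be one less than the buffer size, and it has to be updated by hand whenever the buffer size changes, which is part of why the bounded replacement functions exist.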
There are two reasons:

1. scanf() can leave stdin in a state that's difficult to predict; this makes error recovery difficult if not impossible (this is less of a problem with fscanf()); and
2. the scanf() family take pointers as arguments, but no length limit, so they can overrun a buffer and alter unrelated variables that happen to be after the buffer, causing seemingly random memory-corruption errors that are very difficult to understand, find, and debug, particularly for less experienced C programmers.

Novice C programmers are often confused about pointers and the “address-of” operator, and frequently omit the & where it's needed, or add it “for good measure” where it's not. This causes “random” segfaults that can be hard for them to find. This isn't scanf()'s fault, so I leave it off my list, but it is worth bearing in mind.

After 23 years, I still remember it being a huge pain when I started C programming and didn't know how to recognize and debug these kinds of errors, and (as someone who spent years teaching C to beginners) it's very hard to explain them to a novice who doesn't yet understand pointers and the stack.
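The address-of pitfall in miniature, as a sketch (the wrapper function is mine):

```c
#include <assert.h>
#include <stdio.h>

/* The & matters: sscanf() stores through the pointers it receives.
 * Writing sscanf(s, "%d", n) -- missing the & -- would pass n's value
 * reinterpreted as an address: undefined behavior, and typically one of
 * the "random" segfaults described above. Modern compilers can warn
 * about this, but only if warnings are enabled and heeded. */
int parse_int(const char *s, int *out)
{
    return sscanf(s, "%d", out);
}
```

The return value is the number of successful conversions, so checking it against the expected count is the minimum error handling any scanf-family call needs.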
Anyone who recommends scanf() to a novice C programmer should be flogged mercilessly. OK, maybe not mercilessly, but some kind of flogging is definitely in order ;o)
When fscanf() fails, due to an input failure or a matching failure, the file pointer (that is, the position in the file from which the next byte will be read) is left in a position other than where it would be had the fscanf() succeeded. This is typically undesirable in sequential file reads. Reading one line at a time results in the file input being predictable, while single line failures can be handled individually.
Imagine a file with three lines: an integer, then a line of garbage containing a stray 'b', then another integer.
Using fscanf() to read integers, the first line would read fine, but on the second line fscanf() would leave you at the 'b', not sure what to do from there. You would need some mechanism to move past the garbage input to see the third line.

If you do an fgets() and sscanf(), you can guarantee that your file pointer moves a line at a time, which is a little easier to deal with. In general, you should still be looking at the whole string to report any odd characters in it.

I prefer the latter approach myself, although I wouldn't agree with the statement that "it's almost always a bad idea to use fscanf()"... fscanf() is perfectly fine for most things.
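To make the file example concrete, here is a sketch assuming the three lines are "1", "2b", and "3" (the exact sample file is an assumption):

```c
#include <assert.h>
#include <stdio.h>

/* Read integers with fscanf() until it fails. Given "1\n2b\n3\n" this
 * reads 1, then 2, then gets stuck at the 'b': every further fscanf()
 * fails, so the 3 on the last line is never reached. */
int read_with_fscanf(FILE *f, int *out, int max)
{
    int n = 0;
    while (n < max && fscanf(f, "%d", &out[n]) == 1)
        n++;
    return n;
}

/* fgets() + sscanf(): the stream always advances a full line, so the
 * read continues past trouble and finds the 3. Note "2b" still parses
 * as 2 here -- inspecting the rest of the string for garbage is the
 * caller's job, as the answer notes. */
int read_with_fgets(FILE *f, int *out, int max)
{
    char line[128];
    int n = 0;
    while (n < max && fgets(line, sizeof line, f))
        if (sscanf(line, "%d", &out[n]) == 1)
            n++;
    return n;
}
```

The contrast is in how far each approach gets: the fscanf() loop stalls at the 'b' and returns two values, while the line-at-a-time loop reaches the third line.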
is perfectly fine for most things.The case where this comes into play is when you match character literals. Suppose you have:
Consider two possible inputs "323,A424" and "323A424".

In both cases fscanf() will return 1, and the next character read will be an 'A'. There is no way to determine whether the comma was matched or not.

That being said, this only matters if finding the actual source of the error is important. In cases where knowing that there is a malformed-input error is enough, fscanf() is actually superior to writing custom parsing code.
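For that last case, a sketch of using fscanf()'s return value as an all-or-nothing validity check (the record format and names are assumed for illustration):

```c
#include <assert.h>
#include <stdio.h>

/* If all you need to know is "this record is malformed", comparing
 * fscanf()'s return value against the number of conversions requested
 * is far less code than a hand-written parser. Returns 1 only for a
 * well-formed "x,y,z" record, 0 otherwise -- without saying which
 * part failed, per the limitation above. */
int read_record(FILE *f, int *x, int *y, int *z)
{
    return fscanf(f, "%d,%d,%d", x, y, z) == 3;
}
```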