How shall I put all of this into a clear question? Let me try.
I know that files opened with fopen() are buffered in memory. We use a buffer for efficiency and ease. During a read from the file, the contents of the file are first read into the buffer, and we read from that buffer. Similarly, during a write, the contents are written to the buffer first and then to the file.
But what about fseek(), fsetpos() and rewind() dropping the effect of previous calls to ungetc()? Can you tell me how that is done? I mean, suppose we have opened a file for reading and it has been copied into the buffer. Now, using ungetc(), we have changed some characters in the buffer. Here is what I fail to understand even after much effort:
Here is what is said about ungetc(): "A call to fseek, fsetpos or rewind on stream will discard any characters previously put back into it with this function." How can characters already put into the buffer be discarded? One approach is that the original characters that were removed are "remembered", and each new character that was put in is identified and replaced with the original character. But that seems very inefficient. The other option is to load a copy of the original file into the buffer and place the file pointer at the intended position. Which of these two approaches do fseek, fsetpos or rewind take to discard the characters put back using ungetc()?
For text streams, how does the presence of unread characters in the stream, characters that were put back using ungetc(), affect the return value of ftell()? My confusion arises from the following line about ftell() and ungetc() from this link about ftell (SOURCE):
"For text streams, the numerical value may not be meaningful but can still be used to restore the position to the same position later using fseek (if there are characters put back using ungetc still pending of being read, the behavior is undefined)."
- Focusing on the last part of the above paragraph, what has "pending of being read" got to do with an ungetc()-obtained character being discarded? Each time we read a character that was put into the stream using ungetc(), is it discarded after the read?
Let's start from the beginning.
The ungetc() function shall push the byte specified by c (converted to an unsigned char) back onto the input stream pointed to by stream. A character is virtually put back into an input stream, decreasing its internal file position as if a previous getc operation was undone. This only affects further input operations on that stream, and not the content of the physical file associated with it, which is not modified by any calls to this function.
The new position, measured in bytes from the beginning of the file, shall be obtained by adding offset to the position specified by whence. The specified point is the beginning of the file for SEEK_SET, the current value of the file-position indicator for SEEK_CUR, or end-of-file for SEEK_END. fseek either flushes any buffered output before setting the file position or else remembers it so it will be written later in its proper place in the file.
The fsetpos() function sets the file position and state indicators for the stream pointed to by stream according to the value of the object pointed to by pos, which must be a value obtained from an earlier call to fgetpos() on the same stream.
The rewind function repositions the file pointer associated with stream to the beginning of the file. A call to rewind is similar to
(void) fseek( stream, 0L, SEEK_SET );
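So, as you can see, pushing back characters with ungetc() doesn't alter the file; only the internal buffering for the stream is affected. Your second idea, "The other option is to load a copy of the original file into buffer and place the file pointer at the intended position", is therefore the closer one: the repositioning functions throw away what is in the buffer, and later reads fetch the original data from the file again at the new position (a buffer's worth at a time, not a copy of the whole file).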
Now, answering your second question: a successful intervening call (with the stream pointed to by stream) to a file-positioning function discards any pushed-back characters for the stream. The external storage corresponding to the stream is unchanged.
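The discard behavior quoted above can be observed with a small experiment. This is only a sketch: the helper name next_char_after_pushback is made up for illustration, and it uses tmpfile() as a scratch file, which is assumed to succeed on your system.

```c
#include <stdio.h>

/* Hypothetical helper, not part of any library.
 * Writes "abc" to a scratch file, reads 'a', pushes back 'X',
 * optionally repositions to the start, and returns the next character read.
 * Returns EOF if any step fails. */
int next_char_after_pushback(int reposition)
{
    FILE *fp = tmpfile();            /* scratch stream opened for update */
    int result = EOF;

    if (fp == NULL)
        return EOF;

    if (fputs("abc", fp) != EOF) {
        rewind(fp);                  /* required before switching from writing to reading */

        if (getc(fp) == 'a' && ungetc('X', fp) == 'X') {
            /* A successful fseek discards the pushed-back 'X'. */
            if (!reposition || fseek(fp, 0L, SEEK_SET) == 0)
                result = getc(fp);
        }
    }

    fclose(fp);
    return result;
}
```

Calling next_char_after_pushback(0) returns 'X', the pushed-back character. Calling next_char_after_pushback(1) returns 'a': the 'X' is gone and the data read comes from the file, which never contained it.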
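A good mental model of the put-back character is that it is simply an extra little property which hangs off the FILE * object. Imagine that the FILE structure has a member named putback_char, initialized to the value EOF, which indicates "there is no putback char", and that ungetc simply stores the character in this member. Imagine that every read operation goes through getc, and that getc first checks putback_char: if it is not EOF, getc returns that character and resets the member to EOF; otherwise it reads from the regular buffer as usual. The functions which clear the pushback simply assign EOF to putback_char.

In other words, the put-back character (and only one needs to be supported) can actually be a miniature buffer which is separate from the regular buffering. (Consider that even an unbuffered stream supports ungetc: such a stream has to put the byte or character somewhere.)

Regarding the position indicator, the C99 standard says this (7.19.7.11, paragraph 5):

"A successful call to the ungetc function clears the end-of-file indicator for the stream. The value of the file position indicator for the stream after reading or discarding all pushed-back characters shall be the same as it was before the characters were pushed back. For a text stream, the value of its file position indicator after a successful call to the ungetc function is unspecified until all pushed-back characters are read or discarded. For a binary stream, its file position indicator is decremented by each successful call to the ungetc function; if its value was zero before a call, it is indeterminate after the call."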
So, the www.cplusplus.com reference you're using is incorrect; the behavior of ftell is not undefined when there are pending characters pushed back with ungetc. For text streams, the value is unspecified. Accessing an unspecified value isn't undefined behavior, because an unspecified value cannot be a trap representation. The undefined behavior exists for binary streams if a push back occurs at position zero, because the position then becomes indeterminate. Indeterminate means that it's an unspecified value which could be a trap representation. Accessing it could halt the program with an error message, or trigger other behaviors.

It's better to get programming language and library specifications from the horse's mouth, rather than from random websites.
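To make the putback_char mental model from this answer concrete, here is a minimal sketch of a toy stream that reads from an in-memory string. Everything named toy_* is hypothetical and invented for this illustration; real FILE implementations are organized differently, and this toy does not model the file position rules quoted from the standard.

```c
#include <stdio.h>   /* for EOF and size_t */

/* Hypothetical toy stream; a stand-in for FILE, not how any libc does it. */
struct toy_stream {
    const char *data;    /* stand-in for the underlying file contents */
    size_t len;          /* number of bytes in data */
    size_t pos;          /* index of the next byte to read */
    int putback_char;    /* EOF means "there is no putback char" */
};

void toy_open(struct toy_stream *s, const char *data, size_t len)
{
    s->data = data;
    s->len = len;
    s->pos = 0;
    s->putback_char = EOF;
}

/* Store the character in the one-slot side buffer. */
int toy_ungetc(int c, struct toy_stream *s)
{
    if (c == EOF || s->putback_char != EOF)
        return EOF;                      /* only one slot in this model */
    s->putback_char = (unsigned char)c;
    return s->putback_char;
}

/* Every read checks the side buffer first, then the regular data. */
int toy_getc(struct toy_stream *s)
{
    if (s->putback_char != EOF) {
        int c = s->putback_char;
        s->putback_char = EOF;           /* consumed: the slot is empty again */
        return c;
    }
    if (s->pos >= s->len)
        return EOF;
    return (unsigned char)s->data[s->pos++];
}

/* Repositioning clears the pushback by assigning EOF to the slot. */
int toy_seek(struct toy_stream *s, size_t pos)
{
    if (pos > s->len)
        return -1;
    s->pos = pos;
    s->putback_char = EOF;
    return 0;
}
```

With a stream opened over "abc", reading 'a', pushing back 'X' and reading again yields 'X' and then 'b'. Pushing back a character and then calling toy_seek(&s, 0) makes the next read return 'a', because the slot was reset to EOF and the underlying data was never touched.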