I'm working on an implementation of the tail function and I'm only supposed to use read(), write() and lseek() for I/O. So far I have this:
int printFileLines(int fileDesc)
{
    char c;
    int lineCount = 0, charCount = 0;
    int pos = 0, rState;
    while(pos != -1 && lineCount < 10)
    {
        if((rState = read(fileDesc, &c, 1)) < 0)
        {
            perror("read:");
        }
        else if(rState == 0) break;
        else
        {
            if(pos == -1)
            {
                pos = lseek(fileDesc, 0, SEEK_END);
            }
            pos--;
            pos=lseek(fileDesc, pos, SEEK_SET);
            if (c == '\n')
            {
                lineCount++;
            }
            charCount++;
        }
    }
    if (lineCount >= 10)
        lseek(fileDesc, 2, SEEK_CUR);
    else
        lseek(fileDesc, 0, SEEK_SET);
    char *lines = malloc(charCount - 1 * sizeof(char));
    read(fileDesc, lines, charCount);
    lines[charCount - 1] = 10;
    write(STDOUT_FILENO, lines, charCount);
    return 0;
}
So far it works for files that have more than 10 lines, but it breaks when I pass a file with fewer than 10 lines: it just prints the last line of that file. I also can't get it to work with stdin.
If someone can give me an idea of how to fix these issues, that'd be great :D
The first issue:
If you read a newline in the read(fileDesc, &c, 1) call, then set the position directly to the character preceding that newline (pos--; followed by pos = lseek(fileDesc, pos, SEEK_SET);), and lineCount is then >= 10 (the while loop terminates), the first char you would read next is the last char of the line preceding that newline. The newline itself also isn't part of the last 10 lines, so just skip two chars forward from the current stream position: lseek(fileDesc, 2, SEEK_CUR).
For the second issue:
Let's assume that the stream offset has reached the beginning of the stream, i.e. the last lseek() returned pos == 0.
The while condition (pos != -1 && lineCount < 10) is still true.
Now a char is read. After this, the file offset is 1 (the second character).
Then pos-- drops pos to -1, and lseek(fileDesc, pos, SEEK_SET) fails.
Since lseek has failed, the position in the file is still the second character, hence the first character is missing. Fix this by resetting the file offset to the beginning of the file if pos == -1 after the while loop: lseek(fileDesc, 0, SEEK_SET).
Performance:
This needs a great many system calls: at least one read() and one lseek() per character. An easy enhancement would be to use the buffered f* functions such as fopen(), fseek(), fgetc(), etc. Additionally, this doesn't need system-specific functions.
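As a rough sketch of the buffered variant: tail_stdio and its parameters are made-up names, and it assumes a seekable stream, so it won't work on a pipe or terminal. How many system calls it actually saves depends on how the C library treats fseek() within its buffer.

```c
#include <stdio.h>

/* Hypothetical helper: print the last `n` lines of the seekable stream
 * `in` to `out`. Returns 0 on success, -1 on a seek or read error. */
int tail_stdio(FILE *in, FILE *out, int n)
{
    int c;

    if (n <= 0)
        return 0;
    if (fseek(in, 0, SEEK_END) != 0)
        return -1;
    long size = ftell(in);
    if (size < 0)
        return -1;

    long pos = size;   /* scan position, moves towards 0 */
    long start = 0;    /* where printing begins; 0 means the whole file */
    int newlines = 0;

    while (pos > 0) {
        pos--;
        if (fseek(in, pos, SEEK_SET) != 0)
            return -1;
        if ((c = fgetc(in)) == EOF)
            return -1;
        /* A newline that is the very last byte only terminates the
         * final line, so it is not counted as a separator. */
        if (c == '\n' && pos != size - 1 && ++newlines == n) {
            start = pos + 1;
            break;
        }
    }

    /* If fewer than n separators were found, start is still 0 and the
     * whole file is printed (the "fewer than 10 lines" case). */
    if (fseek(in, start, SEEK_SET) != 0)
        return -1;
    while ((c = fgetc(in)) != EOF)
        fputc(c, out);
    return 0;
}
```

With a descriptor like the one in the question you could wrap it first, e.g. FILE *in = fdopen(fileDesc, "r"); (fdopen() is POSIX, not standard C), and then call tail_stdio(in, stdout, 10).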
Even better would be to read the file backwards chunk by chunk and operate on these chunks, but this needs some more coding effort.
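Here is one way the chunked approach might look, using only read(), write() and lseek() as the question requires. The names find_tail_start, copy_from and CHUNK are invented for this sketch, the descriptor is assumed to be seekable (a regular file), and EINTR handling is left out for brevity.

```c
#include <sys/types.h>
#include <unistd.h>

#define CHUNK 4096   /* arbitrary chunk size chosen for this sketch */

/* Return the offset at which the last `n` lines of `fd` begin,
 * or -1 on error. Returns 0 if the file has fewer than n lines. */
off_t find_tail_start(int fd, int n)
{
    char buf[CHUNK];
    off_t size = lseek(fd, 0, SEEK_END);
    if (size < 0)
        return -1;
    if (n <= 0)
        return size;

    off_t pos = size;
    int newlines = 0;

    while (pos > 0) {
        /* Step back by one chunk, or by whatever is left at the start. */
        size_t want = pos < CHUNK ? (size_t)pos : CHUNK;
        pos -= (off_t)want;
        if (lseek(fd, pos, SEEK_SET) < 0)
            return -1;

        /* read() may return fewer bytes than requested, so loop. */
        size_t have = 0;
        while (have < want) {
            ssize_t r = read(fd, buf + have, want - have);
            if (r <= 0)
                return -1;
            have += (size_t)r;
        }

        /* Scan the chunk backwards; a newline that is the very last
         * byte of the file is not counted as a separator. */
        for (size_t i = want; i-- > 0; ) {
            off_t at = pos + (off_t)i;
            if (buf[i] == '\n' && at != size - 1 && ++newlines == n)
                return at + 1;
        }
    }
    return 0;
}

/* Copy everything from offset `start` to the end of in_fd to out_fd. */
int copy_from(int in_fd, int out_fd, off_t start)
{
    char buf[CHUNK];
    ssize_t r;

    if (lseek(in_fd, start, SEEK_SET) < 0)
        return -1;
    while ((r = read(in_fd, buf, sizeof buf)) > 0) {
        ssize_t done = 0;
        while (done < r) {
            ssize_t w = write(out_fd, buf + done, (size_t)(r - done));
            if (w <= 0)
                return -1;
            done += w;
        }
    }
    return r < 0 ? -1 : 0;
}
```

Usage would be roughly off_t start = find_tail_start(fileDesc, 10); followed by copy_from(fileDesc, STDOUT_FILENO, start); after checking start for -1. This costs about one lseek() and one read() per 4096 bytes instead of per character.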
For Unix, you could also mmap() the whole file and search backwards in memory for newline characters.
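A minimal sketch of the mmap() idea, assuming a regular file that fits into the address space; tail_mmap is a made-up name and error handling is kept to the bare minimum.

```c
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write the last `n` lines of the regular file
 * `fd` to `out_fd`. Returns 0 on success, -1 on error. */
int tail_mmap(int fd, int out_fd, int n)
{
    struct stat st;
    if (fstat(fd, &st) < 0)
        return -1;
    if (st.st_size == 0 || n <= 0)
        return 0;                      /* mmap() rejects a zero length */

    size_t size = (size_t)st.st_size;
    char *data = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (data == MAP_FAILED)
        return -1;

    /* Search backwards for the n-th newline. The loop body first sees
     * index size - 2, so the final byte (usually the newline that ends
     * the last line) is never counted as a separator. */
    size_t start = 0;
    int newlines = 0;
    for (size_t i = size - 1; i-- > 0; ) {
        if (data[i] == '\n' && ++newlines == n) {
            start = i + 1;
            break;
        }
    }

    /* write() may be partial, so keep going until everything is out. */
    int rc = 0;
    size_t off = start;
    while (off < size) {
        ssize_t w = write(out_fd, data + off, size - off);
        if (w <= 0) {
            rc = -1;
            break;
        }
        off += (size_t)w;
    }

    munmap(data, size);
    return rc;
}
```

Called as tail_mmap(fileDesc, STDOUT_FILENO, 10), this needs no lseek() at all, but like the other variants it cannot handle stdin when stdin is a pipe or terminal.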