Read from a specific spot in a file C++

Published 2019-05-02 07:04

Question:

I have a program in C++ that needs to return a line that a specific word appears in. For instance, if my file looks like this:

the cow jumped over
the moon with the
green cheese in his mouth

and I need to print the line that has "with". All the program gets is the offset from the beginning of the file (in this case 24, since "with" is 24 characters from the beginning of the file).

How do I print the whole line "the moon with the", with just the offset?

Thanks a lot!

Answer 1:

A good solution is reading the file from the beginning until the desired position (answer by @Chet Simpson). If you want optimization (e.g. very large file, position somewhere in the middle, typical lines rather short), you can read the file backwards. However, this only works with files opened in binary mode (any file on unix-like platforms; open the file with ios_base::binary parameter on Windows).
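To make the binary-mode requirement concrete, here is a minimal sketch of opening a file in binary mode and reading a few bytes at a given offset. The helper name ReadBytesAt and its parameters are made up for illustration and are not part of the original answer.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: returns up to `count` bytes starting at byte `offset`,
// or fewer (possibly none) if the file is shorter or cannot be opened.
std::string ReadBytesAt(const std::string& path, std::streamoff offset, std::size_t count)
{
    // Binary mode: no newline translation, so offsets are plain byte counts
    std::ifstream f(path, std::ios_base::in | std::ios_base::binary);
    if (!f)
        return "";

    f.seekg(offset, std::ios_base::beg);

    std::string bytes(count, '\0');
    f.read(&bytes[0], static_cast<std::streamsize>(count));

    // Keep only what was actually read
    bytes.resize(static_cast<std::size_t>(f.gcount()));
    return bytes;
}
```

Because the stream is opened in binary mode, the offset passed to seekg is an exact byte position in the file, which is what the backward scan in this answer relies on.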

The algorithm goes as follows:

  • Go back a few bytes in file
  • Read the few bytes
  • If there is an end-of-line there, the rest is easy
  • Otherwise, repeat

Code (tested on Windows):

#include <istream>
#include <stdexcept>
#include <string>

std::string GetSurroundingLine(std::istream& f, std::istream::pos_type start_pos)
{
    std::istream::pos_type prev_pos = start_pos;
    std::istream::pos_type pos;
    char buffer[40]; // typical line length, so typical iteration count is 1
    std::istream::pos_type size = sizeof(buffer);

    // Look for the beginning of the line that includes the given position
    while (true)
    {
        // Move back up to 40 bytes from prev_pos
        if (prev_pos < size)
            pos = 0;
        else
            pos = prev_pos - size;
        f.seekg(pos);

        // Read the bytes between pos and prev_pos (40, or fewer near the start of the file)
        int count = static_cast<int>(prev_pos - pos);
        f.read(buffer, count);
        if (!f)
            throw std::runtime_error("GetSurroundingLine: read failed");

        // Look backwards through the bytes just read for a newline byte,
        // which terminates the previous line
        int eol_pos;
        for (eol_pos = count - 1; eol_pos >= 0; --eol_pos)
            if (buffer[eol_pos] == '\n')
                break;

        // If found newline or got to beginning of file - done looking
        if (eol_pos >= 0 || pos == (std::istream::pos_type)0)
        {
            pos += eol_pos + 1;
            break;
        }

        // No newline in this chunk: continue with the chunk before it
        prev_pos = pos;
    }

    // Position the read pointer
    f.seekg(pos);

    // Read the line
    std::string s;
    std::getline(f, s, '\n');

    return s;
}

Edit: On Windows-like platforms, where end-of-line is marked by \r\n, since you have to use binary mode, the output string will contain the extra character \r (unless there is no end-of-line at end-of-file), which you can throw away.
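Throwing away the extra \r could look like the sketch below. The helper name StripTrailingCR is hypothetical and not part of the original answer; it assumes the line came from a file with \r\n endings read in binary mode.

```cpp
#include <string>

// Hypothetical helper: drop a single trailing '\r' left over from a "\r\n"
// line ending. Lines without a trailing '\r' are returned unchanged.
std::string StripTrailingCR(std::string line)
{
    if (!line.empty() && line.back() == '\r')
        line.pop_back();
    return line;
}
```

It would be applied to the result, for example StripTrailingCR(GetSurroundingLine(f, pos)).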



Answer 2:

You can do this by reading each line individually and recording the file position before and after the read. Then it's just a simple check to see if the offset of the word falls within the bounds of that line.

#include <iostream>
#include <fstream>
#include <string>

std::string LineFromOffset(
    const std::string &filename,
    std::istream::pos_type targetIndex)
{
    std::ifstream input(filename);

    //  Save the start position of the first line. Should be zero of course.
    std::istream::pos_type  lineStartIndex = input.tellg();

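    std::string   line;

    //  Loop on the result of getline rather than on eof(), so a failed read
    //  is never treated as a line
    while(std::getline(input, line))
    {
        //  Get the end position of the line. If the last line has no trailing
        //  newline, getline sets eofbit and tellg() would report -1, so
        //  compute the end from the line length instead
        std::istream::pos_type  lineEndIndex = input.eof()
            ? lineStartIndex + static_cast<std::streamoff>(line.size())
            : input.tellg();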

        //  If the index of the word we're looking for is within the bounds
        //  of the line, return it
        if(targetIndex >= lineStartIndex && targetIndex < lineEndIndex)
        {
            return line;
        }

        // The end of this line is the start of the next one. Set it
        lineStartIndex = lineEndIndex;
    }

    //  Need a better way to indicate failure
    return "";
}

void PrintLineTest()
{
    std::string str = LineFromOffset("test.txt", 24);

    std::cout << str;
}


Answer 3:

There is a function for each of the operations:

fopen - open the file

fseek - seek the file to the desired offset

fread - read the amount of bytes you want

fclose - close the file
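A minimal sketch of how these four calls might be combined to get the line containing a given offset. The function name LineAtOffsetStdio and its byte-at-a-time strategy are assumptions made for illustration, not something this answer spells out. It reads one byte per call for simplicity, so it favours clarity over speed.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: returns the line containing byte `offset`,
// or an empty string if the file cannot be opened or the offset is out of range.
std::string LineAtOffsetStdio(const char* path, long offset)
{
    std::FILE* f = std::fopen(path, "rb"); // binary mode: offsets are byte counts
    if (!f)
        return "";

    // Walk backwards one byte at a time until the previous '\n' or the file start
    long start = offset;
    while (start > 0)
    {
        char c;
        if (std::fseek(f, start - 1, SEEK_SET) != 0 || std::fread(&c, 1, 1, f) != 1)
        {
            std::fclose(f);
            return "";
        }
        if (c == '\n')
            break;
        --start;
    }

    // Read forward from the line start up to the next '\n' or end of file
    std::string line;
    if (std::fseek(f, start, SEEK_SET) == 0)
    {
        char c;
        while (std::fread(&c, 1, 1, f) == 1 && c != '\n')
            line += c;
    }

    std::fclose(f);
    return line;
}
```

With the file from the question, LineAtOffsetStdio("test.txt", 24) should return the second line, assuming the file uses \n line endings.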