What is the best way to read an entire file into a

2018-12-31 04:47发布

How do I read a file into a std::string, i.e., read the whole file at once?

Text or binary mode should be specified by the caller. The solution should be standard-compliant, portable and efficient. It should not needlessly copy the string's data, and it should avoid reallocations of memory while reading the string.

One way to do this would be to stat the filesize, resize the std::string and fread() into the std::string's const_cast<char*>()'ed data(). This requires the std::string's data to be contiguous which is not required by the standard, but it appears to be the case for all known implementations. What is worse, if the file is read in text mode, the std::string's size may not equal the file's size.

A fully correct, standard-compliant and portable solutions could be constructed using std::ifstream's rdbuf() into a std::ostringstream and from there into a std::string. However, this could copy the string data and/or needlessly reallocate memory. Are all relevant standard library implementations smart enough to avoid all unnecessary overhead? Is there another way to do it? Did I miss some hidden Boost function that already provides the desired functionality?

Please show your suggestion how to implement it.

void slurp(std::string& data, bool is_binary)

taking into account the discussion above.

11条回答
旧人旧事旧时光
2楼-- · 2018-12-31 04:51

Never write into the std::string's const char * buffer. Never ever! Doing so is a massive mistake.

Reserve() space for the whole string in your std::string, read chunks from your file of reasonable size into a buffer, and append() it. How large the chunks have to be depends on your input file size. I'm pretty sure all other portable and STL-compliant mechanisms will do the same (yet may look prettier).

查看更多
人气声优
3楼-- · 2018-12-31 04:52

And the fastest (that I know of, discounting memory-mapped files):

std::string str(static_cast<std::stringstream const&>(std::stringstream() << in.rdbuf()).str());

This requires the additional header <sstream> for the string stream. (The static_cast is necessary since operator << returns a plain old ostream& but we know that in reality it’s a stringstream& so the cast is safe.)

Split into multiple lines, moving the temporary into a variable, we get a more readable code:

std::string slurp(std::ifstream& in) {
    std::stringstream sstr;
    sstr << in.rdbuf();
    return sstr.str();
}

Or, once again in a single line:

std::string slurp(std::ifstream& in) {
    return static_cast<std::stringstream const&>(std::stringstream() << in.rdbuf()).str();
}
查看更多
呛了眼睛熬了心
4楼-- · 2018-12-31 04:52

You can use the 'std::getline' function, and specify 'eof' as the delimiter. The resulting code is a little bit obscure though:

std::string data;
std::ifstream in( "test.txt" );
std::getline( in, data, std::string::traits_type::to_char_type( 
                  std::string::traits_type::eof() ) );
查看更多
梦寄多情
5楼-- · 2018-12-31 04:55

What if you are slurping a 11K file, then you have to do it in a series of chunks, so you have to use something like std::vector to slurp it in large chunks of strings.

查看更多
高级女魔头
6楼-- · 2018-12-31 05:02

If you have C++17 (std::filesystem), there is also this way (which gets the file's size through std::filesystem::file_size instead of seekg and tellg):

#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem;

std::string readFile(fs::path path)
{
    // Open the stream to 'lock' the file.
    std::ifstream f{ path };

    // Obtain the size of the file.
    const auto sz = fs::file_size(path);

    // Create a buffer.
    std::string result(sz, ' ');

    // Read the whole file into the buffer.
    f.read(result.data(), sz);

    return result;
}

Note: you may need to use <experimental/filesystem> and std::experimental::filesystem if your standard library doesn't yet fully support C++17. You might also need to replace result.data() with &result[0] if it doesn't support non-const std::basic_string data.

查看更多
心情的温度
7楼-- · 2018-12-31 05:03

See this answer on a similar question.

For your convenience, I'm reposting CTT's solution:

string readFile2(const string &fileName)
{
    ifstream ifs(fileName.c_str(), ios::in | ios::binary | ios::ate);

    ifstream::pos_type fileSize = ifs.tellg();
    ifs.seekg(0, ios::beg);

    vector<char> bytes(fileSize);
    ifs.read(bytes.data(), fileSize);

    return string(bytes.data(), fileSize);
}

This solution resulted in about 20% faster execution times than the other answers presented here, when taking the average of 100 runs against the text of Moby Dick (1.3M). Not bad for a portable C++ solution, I would like to see the results of mmap'ing the file ;)

查看更多
登录 后发表回答