How do I read a file into a std::string, i.e., read the whole file at once?
Text or binary mode should be specified by the caller. The solution should be standard-compliant, portable and efficient. It should not needlessly copy the string's data, and it should avoid reallocations of memory while reading the string.
One way to do this would be to stat the file size, resize the std::string and fread() into the std::string's const_cast<char*>()'ed data(). This requires the std::string's data to be contiguous, which is not required by the standard, but it appears to be the case for all known implementations. What is worse, if the file is read in text mode, the std::string's size may not equal the file's size.
A fully correct, standard-compliant and portable solution could be constructed using std::ifstream's rdbuf() into a std::ostringstream and from there into a std::string. However, this could copy the string data and/or needlessly reallocate memory. Are all relevant standard library implementations smart enough to avoid all unnecessary overhead? Is there another way to do it? Did I miss some hidden Boost function that already provides the desired functionality?
Please show your suggestion of how to implement a function void slurp(std::string& data, bool is_binary), taking into account the discussion above.
Never write into the std::string's const char * buffer. Never ever! Doing so is a massive mistake.

reserve() space for the whole string in your std::string, read chunks of a reasonable size from your file into a buffer, and append() each one. How large the chunks should be depends on your input file size. I'm pretty sure all other portable and STL-compliant mechanisms will do the same (yet may look prettier).
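A sketch of the reserve-and-append approach just described; the use of C stdio, the function signature, and the 4096-byte chunk size are all illustrative assumptions, not part of the original answer:

```cpp
#include <cstdio>
#include <string>

// Read the rest of an open FILE* into a string, reserving the
// expected size up front to avoid reallocations while appending.
std::string slurp(std::FILE* fp, std::size_t expected_size) {
    std::string data;
    data.reserve(expected_size);
    char buf[4096];                 // chunk size is a tuning knob
    std::size_t n;
    while ((n = std::fread(buf, 1, sizeof buf, fp)) > 0)
        data.append(buf, n);        // append each chunk as it arrives
    return data;
}
```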
And the fastest (that I know of, discounting memory-mapped files):
This requires the additional header <sstream> for the string stream. (The static_cast is necessary since operator << returns a plain old ostream& but we know that in reality it's a stringstream&, so the cast is safe.) Split into multiple lines, moving the temporary into a variable, we get more readable code:
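The multi-line version described above would look roughly like this (function name slurp assumed):

```cpp
#include <fstream>
#include <sstream>
#include <string>

std::string slurp(std::ifstream& in) {
    std::stringstream sstr;
    sstr << in.rdbuf();   // stream the whole file buffer into the stringstream
    return sstr.str();    // copy the accumulated contents out as a string
}
```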
Or, once again in a single line:
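Collapsed into a single expression, with the static_cast the answer explains, it would read something like:

```cpp
#include <fstream>
#include <sstream>
#include <string>

std::string slurp(std::ifstream& in) {
    // operator<< on the temporary stringstream returns a plain ostream&,
    // but the referenced object really is a stringstream, so the
    // static_cast back down is safe and lets us call .str().
    return static_cast<std::stringstream const&>(
               std::stringstream() << in.rdbuf()).str();
}
```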
You can use the std::getline function and specify the eof character as the delimiter. The resulting code is a little bit obscure though:
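A sketch of that getline trick: the char form of the traits' eof value never terminates the read early in ordinary text, so getline consumes the whole stream (function name slurp assumed):

```cpp
#include <fstream>
#include <string>

std::string slurp(std::ifstream& in) {
    std::string data;
    // The delimiter is the eof value converted to a char; getline runs
    // until it hits that byte or the end of the stream.
    std::getline(in, data, std::string::traits_type::to_char_type(
                               std::string::traits_type::eof()));
    return data;
}
```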
What if you are slurping an 11K file? Then you have to read it in a series of chunks, using something like a std::vector to collect the large chunks of strings.
If you have C++17 (std::filesystem), there is also this way (which gets the file's size through std::filesystem::file_size instead of seekg and tellg):

Note: you may need to use <experimental/filesystem> and std::experimental::filesystem if your standard library doesn't yet fully support C++17. You might also need to replace result.data() with &result[0] if it doesn't support non-const std::basic_string::data(). See this answer on a similar question.
For your convenience, I'm reposting CTT's solution:
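The reposted solution was the seekg/tellg-plus-istreambuf_iterator variant; a reconstruction along those lines (details are an approximation of that answer):

```cpp
#include <fstream>
#include <iterator>
#include <string>

std::string slurp(const char* filename) {
    std::ifstream t(filename);
    std::string str;
    // Seek to the end to learn the file size, then reserve it up front
    // so the assign() below doesn't trigger reallocations.
    t.seekg(0, std::ios::end);
    str.reserve(t.tellg());
    t.seekg(0, std::ios::beg);
    str.assign(std::istreambuf_iterator<char>(t),
               std::istreambuf_iterator<char>());
    return str;
}
```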
This solution resulted in about 20% faster execution times than the other answers presented here, when taking the average of 100 runs against the text of Moby Dick (1.3M). Not bad for a portable C++ solution, I would like to see the results of mmap'ing the file ;)