To improve performance when reading from a file, I'm trying to read the entire content of a big (several MB) file into memory and then use an istringstream to access the information.
My question is: what is the best way to read this information and "import it" into the string stream? A problem with this approach (see below) is that the buffer gets copied when the string stream is created, so memory usage doubles.
#include <fstream>
#include <sstream>
#include <string>
using namespace std;
int main() {
    // placeholder path for illustration; the original snippet did not define sFilename
    const string sFilename = "data.bin";
    ifstream is;
    is.open(sFilename.c_str(), ios::binary);
    if (!is) return 1; // could not open the file
    // get length of file:
    is.seekg(0, std::ios::end);
    long length = is.tellg();
    is.seekg(0, std::ios::beg);
    // allocate memory:
    char *buffer = new char[length];
    // read data as a block:
    is.read(buffer, length);
    // create string stream of memory contents
    // NOTE: this ends up copying the buffer!!!
    // (the length is passed explicitly because the data is not null-terminated)
    istringstream iss(string(buffer, length));
    // delete temporary buffer
    delete[] buffer;
    // close filestream
    is.close();
    /* ==================================
     * Use iss to access data
     */
}
Another thing to keep in mind is that file I/O is always going to be the slowest operation. Luc Touraille's solution is correct, but there are other options. Reading the entire file into memory at once will be much faster than separate reads.
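As a rough sketch of the "read it all at once" idea, the file can be read straight into a std::string with a single read() call, which avoids the separate new[]/delete[] buffer from the question. The helper name read_whole_file is made up for this illustration, and error handling is kept minimal.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the original posts): reads the whole file
// at `path` into a std::string using one read() call.
// Throws std::runtime_error if the file cannot be opened or fully read.
std::string read_whole_file(const std::string& path) {
    assert(!path.empty());  // precondition: a path must be given

    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }

    // Determine the file size by seeking to the end.
    in.seekg(0, std::ios::end);
    const std::streamoff size = in.tellg();
    if (size < 0) {
        throw std::runtime_error("cannot determine size of " + path);
    }
    in.seekg(0, std::ios::beg);

    // Allocate the string once and read the data as a single block.
    std::string contents(static_cast<std::size_t>(size), '\0');
    if (size > 0) {
        in.read(&contents[0], static_cast<std::streamsize>(size));
        if (in.gcount() != static_cast<std::streamsize>(size)) {
            throw std::runtime_error("short read on " + path);
        }
    }
    return contents;
}
```

The returned string can then be handed to a std::istringstream. Before C++20 that constructor still copies the string, so this alone does not remove the doubled memory use described in the question.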
std::ifstream has a method rdbuf() that returns a pointer to a filebuf. You can then "push" this filebuf into your stringstream with operator<<.
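A minimal sketch of the rdbuf() approach might look like the following. The function name load_into_stringstream is invented for this example, and the code assumes the file is non-empty.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper (name chosen for this sketch): copies the contents of
// the file at `path` into `out` by streaming the file's filebuf.
// Returns false if the file cannot be opened.
bool load_into_stringstream(const std::string& path, std::stringstream& out) {
    assert(out.good());  // precondition: the target stream must be usable

    std::ifstream file(path.c_str(), std::ios::binary);
    if (!file) {
        return false;
    }

    // operator<<(streambuf*) drains the filebuf into the stringstream.
    // Note: if the file is empty, no characters are inserted and the
    // standard requires failbit to be set on `out`.
    out << file.rdbuf();
    return true;
}
```

After the call, the stringstream owns its own copy of the data, so the ifstream can be closed straight away.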
EDIT: As Martin York remarks in the comments, this might not be the fastest solution, since the stringstream's operator<< will read the filebuf character by character. You might want to check his answer, where he uses the ifstream's read method as you did, and then sets the stringstream buffer to point to the previously allocated memory.
This seems like premature optimization to me. How much work is being done in the processing? Assuming a modernish desktop/server, and not an embedded system, copying a few MB of data during initialization is fairly cheap, especially compared to reading the file off disk in the first place. I would stick with what you have, measure the system when it is complete, and then decide if the potential performance gains would be worth it. Of course, if memory is tight, or this is in an inner loop or a program that gets called often (like once a second), that changes the balance.
OK. I am not saying this will be quicker than reading from the file.
But this is a method where you create the buffer once and, after the data is read into the buffer, use it directly as the source for the stringstream.
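One portable way to sketch the "use the buffer directly" idea is a small std::streambuf subclass whose get area points at the already-filled buffer. This is a swapped-in technique and not necessarily what the answer's own code did: it feeds a plain std::istream instead of a std::stringstream, because the effect of setbuf on a stringbuf is implementation-defined. The class name MemBuf is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <streambuf>
#include <string>
#include <vector>

// Hypothetical read-only stream buffer over memory it does not own.
// The caller must keep the underlying storage alive and unmodified
// for as long as the buffer (and any stream using it) is in use.
class MemBuf : public std::streambuf {
public:
    MemBuf(char* data, std::size_t size) {
        assert(data != nullptr || size == 0);
        // Point the get area at the existing bytes; nothing is copied.
        setg(data, data, data + size);
    }
};
```

A caller would fill a std::vector<char> with ifstream::read, construct MemBuf over it, and pass its address to a std::istream. Formatted extraction and getline then work as usual. Seeking is not supported in this sketch, since seekoff and seekpos are not overridden.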
N.B. It is worth mentioning that std::ifstream is buffered. It reads data from the file in (relatively large) chunks. Stream operations are performed against the buffer, only returning to the file for another read when more data is needed. So before sucking all the data into memory, please verify that this is a bottleneck.
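To check whether plain buffered reading is actually a bottleneck, a baseline timing along these lines could be taken first. The struct and function names are invented for this sketch, and a single run is only a rough indication, since the OS file cache will affect repeated measurements.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical result type for this sketch.
struct ReadTiming {
    std::size_t lines;    // number of lines read
    double milliseconds;  // wall-clock time spent opening and reading
};

// Hypothetical helper: reads the file at `path` line by line through an
// ordinary (buffered) std::ifstream and reports how long that took.
// If the file cannot be opened, zero lines are reported.
ReadTiming time_buffered_read(const std::string& path) {
    assert(!path.empty());  // precondition: a path must be given

    const auto start = std::chrono::steady_clock::now();

    std::ifstream in(path.c_str());
    std::size_t lines = 0;
    std::string line;
    while (std::getline(in, line)) {
        ++lines;
    }

    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return ReadTiming{lines, elapsed.count()};
}
```

Comparing this number against the same processing done on an in-memory copy would show whether loading the whole file up front is worth the extra memory.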