This answer points out that C++ is not well suited to iterating over a binary file, but that is what I need right now. In short, I need to operate on files in a "binary" way. Yes, all files are binary, even the .txt ones, but I'm writing something that operates on image files, so I need to read files that are well structured, where the data is arranged in a specific way.
I would like to read the entire file into a data structure such as std::vector<T> so I can almost immediately close the file and work with the content in memory without caring about disk I/O anymore.
Right now, the best way to perform a complete iteration over a file according to the standard library is something along the lines of
std::ifstream ifs(filename, std::ios::binary);
for (std::istreambuf_iterator<char, std::char_traits<char> > it(ifs.rdbuf());
it != std::istreambuf_iterator<char, std::char_traits<char> >(); it++) {
// do something with *it;
}
ifs.close();
or use std::copy, but even with std::copy you are always using istreambuf iterators (so, if I understand the C++ documentation correctly, you are basically reading 1 byte at each call with the previous code).
So the question is: how do I write a custom iterator? What should I inherit from?
I assume that this is also important when writing a file to disk, and that I could use the same iterator class for writing. If I'm wrong, please feel free to correct me.
My suggestion is not to use a custom stream, stream-buffer or stream-iterator.
You could dare to write a stream buffer iterator that reads elements larger than the underlying char_type. Keep in mind, though, that the state of the stream is not maintained by the buffer or the iterator.
It is possible to optimize std::copy() using std::istreambuf_iterator<char> but hardly any implementation does. Just deriving from something won't really do the trick either, because that isn't how iterators work.
The most effective built-in approach is probably to simply dump the file into an std::ostringstream (out << file.rdbuf()) and then get a std::string from there with out.str().
If you want to avoid travelling through a std::string, you could write a stream buffer that dumps the content directly into a memory area or a std::vector<unsigned char>, again using the output operation above.
The std::istreambuf_iterator<char>s could, in principle, have a backdoor to the stream buffer and bypass character-wise operations. Without that backdoor you won't be able to speed up anything using these iterators. You could create an iterator on top of stream buffers using the stream buffer's sgetn() to deal with a similar buffer. In that case you'd pretty much need a version of std::copy() that deals with segments (i.e., each fill of a buffer) efficiently. Short of either, I'd just read the file into a buffer using a stream buffer and iterate over that.