Question:
For some graphics work I need to read in a large amount of data as quickly as possible and would ideally like to directly read and write the data structures to disk. Basically I have a load of 3d models in various file formats which take too long to load so I want to write them out in their "prepared" format as a cache that will load much faster on subsequent runs of the program.
Is it safe to do it like this?
My worry is about reading directly into the vector's data. I've removed error checking, hard-coded 4 as the size of an int, and so on, so that I can give a short working example. I know it's bad code; my question really is whether it is safe in C++ to read a whole array of structures directly into a vector like this. I believe it is, but C++ has so many traps and so much undefined behaviour when you start going low level and dealing directly with raw memory like this.
I realise that number formats and sizes may change across platforms and compilers, but this will only ever be read and written by the same compiled program, to cache data that may be needed on a later run of that program.
#include <fstream>
#include <vector>
using namespace std;
struct Vertex
{
    float x, y, z;
};
typedef vector<Vertex> VertexList;
int main()
{
    // Create a list for testing
    VertexList list;
    Vertex v1 = {1.0f, 2.0f, 3.0f}; list.push_back(v1);
    Vertex v2 = {2.0f, 100.0f, 3.0f}; list.push_back(v2);
    Vertex v3 = {3.0f, 200.0f, 3.0f}; list.push_back(v3);
    Vertex v4 = {4.0f, 300.0f, 3.0f}; list.push_back(v4);
    // Write out a list to a disk file
    ofstream os ("data.dat", ios::binary);
    int size1 = list.size();
    os.write((const char*)&size1, 4);
    os.write((const char*)&list[0], size1 * sizeof(Vertex));
    os.close();
    // Read it back in
    VertexList list2;
    ifstream is("data.dat", ios::binary);
    int size2;
    is.read((char*)&size2, 4);
    list2.resize(size2);
    // Is it safe to read a whole array of structures directly into the vector?
    is.read((char*)&list2[0], size2 * sizeof(Vertex));
}
Answer 1:
As Laurynas says, std::vector is guaranteed to be contiguous, so that should work, but it is potentially non-portable.
On most systems, sizeof(Vertex) will be 12, but it's not uncommon for the struct to be padded, so that sizeof(Vertex) == 16. If you were to write the data on one system and then read that file in on another, there's no guarantee that it will work correctly.
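One way to guard against the padding issue described above is to let the compiler check the layout at build time. The following is only a sketch, assuming a C++11 compiler and assuming the cache format expects three tightly packed 4-byte floats; `vertexBytes` is a hypothetical helper name, not something from the question.

```cpp
#include <cstddef>
#include <type_traits>

struct Vertex
{
    float x, y, z;
};

// Raw byte copies are only meaningful for trivially copyable types.
static_assert(std::is_trivially_copyable<Vertex>::value,
              "Vertex must be trivially copyable for raw binary I/O");

// Assumption: the cache format expects three tightly packed 4-byte floats.
// If a compiler pads the struct, the build fails instead of misreading the file.
static_assert(sizeof(float) == 4, "cache format assumes 4-byte floats");
static_assert(sizeof(Vertex) == 3 * sizeof(float), "unexpected padding in Vertex");

// Hypothetical helper: number of bytes a cached array of n vertices occupies.
constexpr std::size_t vertexBytes(std::size_t n)
{
    return n * sizeof(Vertex);
}
```

If the struct ever picks up padding (for example after adding a member with stricter alignment), the build stops with a readable message rather than silently producing an incompatible cache file.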
Answer 2:
You might be interested in the Boost.Serialization library. It knows how to save/load STL containers to/from disk, among other things. It might be overkill for your simple example, but it might become more useful if you do other types of serialization in your program.
Here's some sample code that does what you're looking for:
#include <algorithm>
#include <cassert>   // needed for assert() below
#include <fstream>
#include <vector>
#include <boost/archive/binary_oarchive.hpp>
#include <boost/archive/binary_iarchive.hpp>
#include <boost/serialization/vector.hpp>
using namespace std;
struct Vertex
{
    float x, y, z;
};
bool operator==(const Vertex& lhs, const Vertex& rhs)
{
    return lhs.x==rhs.x && lhs.y==rhs.y && lhs.z==rhs.z;
}
namespace boost { namespace serialization {
    template<class Archive>
    void serialize(Archive & ar, Vertex& v, const unsigned int version)
    {
        ar & v.x; ar & v.y; ar & v.z;
    }
} }
typedef vector<Vertex> VertexList;
int main()
{
    // Create a list for testing
    const Vertex v[] = {
        {1.0f, 2.0f, 3.0f},
        {2.0f, 100.0f, 3.0f},
        {3.0f, 200.0f, 3.0f},
        {4.0f, 300.0f, 3.0f}
    };
    VertexList list(v, v + (sizeof(v) / sizeof(v[0])));
    // Write out a list to a disk file
    {
        ofstream os("data.dat", ios::binary);
        boost::archive::binary_oarchive oar(os);
        oar << list;
    }
    // Read it back in
    VertexList list2;
    {
        ifstream is("data.dat", ios::binary);
        boost::archive::binary_iarchive iar(is);
        iar >> list2;
    }
    // Check if vertex lists are equal
    assert(list == list2);
    return 0;
}
Note that I had to implement a serialize function for your Vertex in the boost::serialization namespace. This lets the serialization library know how to serialize Vertex members.
I've browsed through the boost::archive::binary_oarchive source code and it seems that it reads/writes the raw vector array data directly from/to the stream buffer, so it should be pretty fast.
Answer 3:
std::vector is guaranteed to be contiguous in memory, so, yes.
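Because the elements are contiguous, the read and write from the question can be wrapped with basic error checking. The following is a sketch only, assuming C++11 (for `data()` and `<cstdint>`) and a file that is written and read by the same build of the same program. `saveVertices` and `loadVertices` are hypothetical names, and the 64-bit count is my own choice in place of the hard-coded 4 bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

struct Vertex
{
    float x, y, z;
};

// Hypothetical helper: writes a 64-bit element count followed by the raw array.
bool saveVertices(const std::string& path, const std::vector<Vertex>& list)
{
    std::ofstream os(path.c_str(), std::ios::binary);
    if (!os) return false;

    const std::uint64_t count = list.size();
    os.write(reinterpret_cast<const char*>(&count), sizeof(count));
    if (count != 0)
        os.write(reinterpret_cast<const char*>(list.data()),
                 static_cast<std::streamsize>(count * sizeof(Vertex)));
    os.close();          // flush now so that write errors are visible here
    return !os.fail();
}

// Hypothetical helper: reads the count, sizes the vector, then fills its storage.
bool loadVertices(const std::string& path, std::vector<Vertex>& list)
{
    std::ifstream is(path.c_str(), std::ios::binary);
    if (!is) return false;

    std::uint64_t count = 0;
    if (!is.read(reinterpret_cast<char*>(&count), sizeof(count))) return false;

    list.resize(static_cast<std::size_t>(count));
    if (count != 0 &&
        !is.read(reinterpret_cast<char*>(list.data()),
                 static_cast<std::streamsize>(count * sizeof(Vertex))))
        return false;    // file was shorter than the count promised
    return true;
}
```

A corrupted count would make `resize` attempt a huge allocation, so a real cache would probably also compare the count against the file size before resizing.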
Answer 4:
I just ran into this exact same problem.
First off, these statements are broken:
os.write((const char*)&list[0], size1 * sizeof(Vertex));
is.read((char*)&list2[0], size2 * sizeof(Vertex));
There is other stuff in the vector data structure, so this will fill your new vector with garbage.
Solution:
When you are writing your vector to a file, don't worry about the size of your Vertex class; just write the entire vector object to the file directly.
os.write((const char*)&list, sizeof(list));
And then you can read the entire vector into memory at once:
is.seekg(0,ifstream::end);
long size2 = is.tellg();
is.seekg(0,ifstream::beg);
list2.resize(size2);
is.read((char*)&list2, size2);
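A caution on this approach: `&list` is the address of the vector object itself, whose size is fixed at compile time and does not depend on how many elements it holds. The elements live in a separate allocation, which is what `&list[0]` points to. Writing `sizeof(list)` bytes therefore stores the vector's internal bookkeeping (typically a few pointers) rather than the vertices, and reading raw bytes over a live vector object is undefined behaviour. A small sketch that makes the difference visible (`objectBytes` and `payloadBytes` are hypothetical helper names):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vertex
{
    float x, y, z;
};

// Size of the vector object itself (its internal bookkeeping).
std::size_t objectBytes(const std::vector<Vertex>& list)
{
    return sizeof(list);
}

// Size of the element array that the vector manages.
std::size_t payloadBytes(const std::vector<Vertex>& list)
{
    return list.size() * sizeof(Vertex);
}
```

For a vector of 1 element and a vector of 1000 elements, `objectBytes` returns the same value, while `payloadBytes` grows with the element count. The original `&list[0]` form from the question is the one that addresses the actual vertex data.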
Answer 5:
An alternative to explicitly reading and writing your vector<> from and to a file is to replace the underlying allocator with one that allocates memory from a memory-mapped file. This would allow you to avoid an intermediate copy for each read and write. However, this approach does have some overhead, and unless your file is very large it may not make sense for your particular case. Profile, as usual, to determine whether this approach is a good fit.
There are also some caveats to this approach that are handled very well by the Boost.Interprocess library. Of particular interest to you may be its allocators and containers.
Answer 6:
If this is used for caching by the same code, I don't see any problem with this. I've used this same technique on multiple systems without a problem (all Unix based). As an extra precaution, you might want to write a struct with known values at the beginning of the file, and check that it reads ok. You might also want to record the size of the struct in the file. This will save a lot of debugging time in the future if the padding ever changes.
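The header idea above could look roughly like the following. This is a sketch under my own assumptions: the `CacheHeader` layout, the constant values, and the function names are all made up for illustration, and it relies on C++11 for `<cstdint>`.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>

struct Vertex
{
    float x, y, z;
};

// Hypothetical cache header: known values first, then the recorded struct size.
struct CacheHeader
{
    std::uint32_t magic;       // arbitrary known integer (assumed value)
    float         probe;       // known float, catches a changed float format
    std::uint32_t vertexSize;  // sizeof(Vertex) at the time of writing
    std::uint32_t count;       // number of vertices that follow
};

const std::uint32_t kMagic = 0x56545831u; // assumed tag, any constant works
const float kProbe = 1.5f;

void writeHeader(std::ostream& os, std::uint32_t count)
{
    const CacheHeader h = { kMagic, kProbe,
                            static_cast<std::uint32_t>(sizeof(Vertex)), count };
    os.write(reinterpret_cast<const char*>(&h), sizeof(h));
}

// Returns true only if the header was read in full and every known value matches.
bool readHeader(std::istream& is, std::uint32_t& count)
{
    CacheHeader h;
    if (!is.read(reinterpret_cast<char*>(&h), sizeof(h))) return false;
    if (h.magic != kMagic || h.probe != kProbe || h.vertexSize != sizeof(Vertex))
        return false;
    count = h.count;
    return true;
}

// Self-check: a header written to an in-memory stream reads back with the same count.
bool headerRoundTrips(std::uint32_t count)
{
    std::stringstream ss(std::ios::in | std::ios::out | std::ios::binary);
    writeHeader(ss, count);
    std::uint32_t got = 0;
    return readHeader(ss, got) && got == count;
}
```

When `readHeader` returns false, the program can simply discard the cache file and rebuild it from the original model files, which is cheap insurance if the padding or compiler ever changes.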