vector serialization

Posted 2019-01-28 03:02

Question:

I am trying to binary-serialize the data of a vector. In the sample below I serialize to a string and then deserialize back to a vector, but I do not get back the data I started with. Why is this the case?

vector<size_t> v;
v.push_back(1);
v.push_back(2);
v.push_back(3);

string s((char*)(&v[0]), 3 * sizeof(size_t));

vector<size_t> w(3);
strncpy((char*)(&w[0]), s.c_str(), 3 * sizeof(size_t));

for (size_t i = 0; i < w.size(); ++i) {
    cout << w[i] << endl;
}

I expect to get the output

1
2
3

but instead get the output

1
0
0

(on gcc-4.5.1)

Answer 1:

The error is in the call to strncpy. From its man page:

If the length of src is less than n, strncpy() pads the remainder of dest with null bytes.

So, once strncpy reaches the first 0 byte in the serialized data, it stops copying and pads the remainder of w's data array with 0s.

To fix this, use a for loop or std::copy:

std::copy( &s[0], 
           &s[0] + v.size() * sizeof(size_t), 
           reinterpret_cast<char *>(w.data()) );

IMO, instead of using std::string as a buffer, just use a char array to hold the serialized data.


Answer 2:

strncpy is a giant pile of fail. It will terminate early on your input because the size_t values contain some zero bytes, which it interprets as the NUL terminator, leaving the remaining elements of w as default-constructed 0. If you ran this test on a big-endian machine, all of them would be 0. Use std::copy.



Answer 3:

To serialize this vector into a string, you first want to convert each element of the vector from an integer into a string containing the ASCII representation of that number; this operation can be called serialization of an integer to a string.

So, for example, assuming an integer has at most 10 digits, we can

// create temporary string to hold each element
char intAsString[10 + 1];

then convert the integer to a string

snprintf(intAsString, sizeof intAsString, "%zu", v[0]);

or, where the non-standard itoa is available,

itoa( v[0], intAsString, 10 /*decimal number*/ );

You can also use std::ostringstream and the << operator.

If you look at the memory contents of intAsString and v[0], they are very different: the first contains the ASCII characters that represent the value of v[0] in decimal (base 10), while v[0] contains the binary representation of the number (because that's how computers store numbers).



Answer 4:

The safest way would be to just loop through the vector and store the values individually into a char array of size 3*sizeof(size_t). That way you don't have a dependency on the internal structure of the vector class implementation.