How to read / write a struct in Binary Files?

Posted 2019-01-10 21:39

Question:

I am facing a small problem. I have a struct that contains a vector, and the size of the vector changes on every iteration. In a given iteration, how do I store the struct, which contains a vector of size n, in a binary file?

Also, when retrieving, assume that I know the size of the vector: how do I read the struct back from the binary file, including all the elements stored in its vector?

I am able to store something in the binary file (I can see its size increase when writing), but when I try to read the elements back, the vector's size comes back as zero.

Unfortunately, I have to achieve this using the standard STL and not use any third-party libraries.

Answer 1:

You should have a look at Boost Serialization.

If you can't use 3rd party libraries, you must know that C++ doesn't support serialization directly. This means you'll have to do it yourself.

This article shows a nice way of serializing a custom object to the disk and retrieving it back. And this tutorial shows you how to get started right now with fstream.

This is my attempt:

EDIT: since the OP asked how to store/retrieve more than one record, I decided to update the original code.

So, what changed? Now there's an array, student_t apprentice[3];, that stores information about 3 students. The array is written to disk as raw bytes and then read back into memory, where reading/searching for specific records is possible. Note that this is a very small file (84 bytes). I do not suggest this approach for searching records in huge files.

#include <fstream>
#include <iostream>
#include <vector>
#include <cstring>

using namespace std;


typedef struct student
{
    char name[10];
    int age;
    vector<int> grades;
}student_t;

int main()
{
    student_t apprentice[3];  
    strcpy(apprentice[0].name, "john");
    apprentice[0].age = 21;
    apprentice[0].grades.push_back(1);
    apprentice[0].grades.push_back(3);
    apprentice[0].grades.push_back(5);    

    strcpy(apprentice[1].name, "jerry");
    apprentice[1].age = 22;
    apprentice[1].grades.push_back(2);
    apprentice[1].grades.push_back(4);
    apprentice[1].grades.push_back(6);

    strcpy(apprentice[2].name, "jimmy");
    apprentice[2].age = 23;
    apprentice[2].grades.push_back(8);
    apprentice[2].grades.push_back(9);
    apprentice[2].grades.push_back(10);

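    // Serializing the array to students.data.
    // Caution: this copies the raw bytes of each struct. For the vector
    // member that means its internal pointers, not the grades themselves.
    ofstream output_file("students.data", ios::binary);
    output_file.write((char*)&apprentice, sizeof(apprentice));
    output_file.close();

    // Reading from it. The grades print correctly below only because the
    // copied pointers still refer to apprentice's live buffers in this same
    // process. A separate program reading the file would get dangling
    // pointers, and overwriting a vector's bytes like this is undefined
    // behavior.
    ifstream input_file("students.data", ios::binary);
    student_t master[3];
    input_file.read((char*)&master, sizeof(master));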

    for (size_t idx = 0; idx < 3; idx++)
    {
        // If you wanted to search for specific records, 
        // you should do it here! if (idx == 2) ...

        cout << "Record #" << idx << endl;
        cout << "Name: " << master[idx].name << endl;
        cout << "Age: " << master[idx].age << endl;
        cout << "Grades: " << endl;
        for (size_t i = 0; i < master[idx].grades.size(); i++)
           cout << master[idx].grades[i] << " ";
        cout << endl << endl;
    }

    return 0;
}

Outputs:

Record #0
Name: john
Age: 21
Grades: 
1 3 5 

Record #1
Name: jerry
Age: 22
Grades: 
2 4 6 

Record #2
Name: jimmy
Age: 23
Grades: 
8 9 10

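Dump of the binary file (the 12 bytes after each age are the vector's three internal pointers; the grade values themselves do not appear in the file):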

$ hexdump -c students.data 
0000000   j   o   h   n  \0 237   {  \0   �   �   {   � 025  \0  \0  \0
0000010   (   �   �  \b   4   �   �  \b   8   �   �  \b   j   e   r   r
0000020   y  \0   �  \0   �   �   |  \0 026  \0  \0  \0   @   �   �  \b
0000030   L   �   �  \b   P   �   �  \b   j   i   m   m   y  \0  \0  \0
0000040   �   6   �  \0 027  \0  \0  \0   X   �   �  \b   d   �   �  \b
0000050   h   �   �  \b                                                
0000054
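
Since the dump holds pointers rather than grades, a file written this way cannot be read back by another run of the program. A minimal sketch of a field-by-field alternative follows, assuming C++11 and that the file is read on the same kind of machine that wrote it (same int size and byte order). The names write_student, read_student and round_trip are hypothetical, and the struct is redeclared with zero-initialized members so that every byte written is defined.

```cpp
#include <cassert>
#include <cstdint>
#include <iostream>
#include <sstream>
#include <vector>

struct student_t {
    char name[10] = {};        // zero-filled so every byte written is defined
    int age = 0;
    std::vector<int> grades;
};

// Write one record field by field: the fixed-size fields first, then the
// number of grades, then the grade values themselves.
void write_student(std::ostream &out, const student_t &s) {
    out.write(s.name, sizeof(s.name));
    out.write(reinterpret_cast<const char *>(&s.age), sizeof(s.age));
    const std::uint32_t count = static_cast<std::uint32_t>(s.grades.size());
    out.write(reinterpret_cast<const char *>(&count), sizeof(count));
    if (count > 0)
        out.write(reinterpret_cast<const char *>(s.grades.data()),
                  static_cast<std::streamsize>(count * sizeof(int)));
}

// Read one record written by write_student. Returns false on a short read.
bool read_student(std::istream &in, student_t &s) {
    in.read(s.name, sizeof(s.name));
    in.read(reinterpret_cast<char *>(&s.age), sizeof(s.age));
    std::uint32_t count = 0;
    in.read(reinterpret_cast<char *>(&count), sizeof(count));
    if (!in)
        return false;
    s.grades.resize(count);    // the stored count tells us how much to read
    if (count > 0)
        in.read(reinterpret_cast<char *>(s.grades.data()),
                static_cast<std::streamsize>(count * sizeof(int)));
    return static_cast<bool>(in);
}

// Round-trips one record through an in-memory stream. The same two calls
// work with std::ofstream / std::ifstream opened with std::ios::binary.
student_t round_trip(const student_t &s) {
    std::stringstream buf(std::ios::in | std::ios::out | std::ios::binary);
    write_student(buf, s);
    student_t copy;
    const bool ok = read_student(buf, copy);
    assert(ok);
    (void)ok;
    return copy;
}
```

To store several records, call write_student once per element and read until read_student returns false. The record layout here is an illustration only; it is not the layout of the students.data file shown above.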


Answer 2:

You typically serialize a vector by writing the length of the vector, followed by that number of elements. When you read it back in, having the length come first lets you know how many more items to read as part of that vector. As a simple first approximation, consider something like this:

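#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>

// Write the element count first, then one element per line.
// Restricted to std::vector<T>: an unconstrained template<class T> would
// compete with the standard operator<< overloads.
template<class T>
std::ostream &operator<<(std::ostream &output, std::vector<T> const &input) {
    typename std::vector<T>::size_type size = input.size();

    output << size << "\n";
    std::copy(input.begin(), input.end(),
         std::ostream_iterator<T>(output, "\n"));

    return output;
}

// Read the element count, then exactly that many elements.
template<class T>
std::istream &operator>>(std::istream &input, std::vector<T> &output) {
    typename std::vector<T>::size_type size = 0;

    input >> size;
    output.resize(size);
    // Constructing an istream_iterator already reads one element,
    // so skip it entirely for an empty vector.
    if (size > 0)
        std::copy_n(
            std::istream_iterator<T>(input),
            size,
            output.begin());

    return input;
}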

This is open to lots of tweaks, improvements, and simple modifications -- just for example, for the moment, I've passed the vector (or whatever -- could be a std::deque, etc.) by reference rather than passing iterators. That probably simplifies most use, but doesn't fit as well with the rest of the library.
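
For comparison, here is a rough sketch of what the iterator-based variant mentioned above might look like, keeping the same text format (count first, then one element per line). The names write_range, read_range and range_demo are hypothetical, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <iterator>
#include <sstream>
#include <vector>

// Hypothetical helper: write the element count, then each element on its
// own line. Needs forward iterators because the range is walked twice.
template<class ForwardIt>
void write_range(std::ostream &output, ForwardIt first, ForwardIt last) {
    output << std::distance(first, last) << "\n";
    for (; first != last; ++first)
        output << *first << "\n";
}

// Hypothetical helper: read a count, then up to that many elements of
// type T, storing each through the output iterator. Returns the number
// of elements actually read.
template<class T, class OutputIt>
std::size_t read_range(std::istream &input, OutputIt dest) {
    std::size_t size = 0;
    if (!(input >> size))
        return 0;
    std::size_t done = 0;
    T value;
    while (done < size && input >> value) {
        *dest++ = value;
        ++done;
    }
    return done;
}

// Intended call pattern: any container works as long as it can hand out
// iterators on the way out and accept a back_inserter on the way in.
void range_demo() {
    std::vector<int> grades;
    grades.push_back(1);
    grades.push_back(3);
    grades.push_back(5);

    std::stringstream stream;
    write_range(stream, grades.begin(), grades.end());

    std::vector<int> restored;
    const std::size_t n =
        read_range<int>(stream, std::back_inserter(restored));
    assert(n == 3 && restored == grades);
    (void)n;
}
```

The cost is visible at the call site: the element type has to be spelled out for reading, which is the kind of extra ceremony that passing the container by reference avoids.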

This also serializes in text format, one number per line. Discussions comparing text to binary have happened before, so I won't try to repeat all the arguments here -- I'll just note that the same basic idea can be done in binary format just as well as text.
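
As a sketch of the binary variant of the same idea: write the element count as a fixed-width integer, then the raw bytes of the elements. This assumes C++11, an element type that is trivially copyable (so no vector of std::string or of vectors), and a reader with the same element size and byte order as the writer. write_binary and read_binary are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <sstream>
#include <type_traits>
#include <vector>

// Write a 64-bit element count followed by the elements' raw bytes.
template<class T>
void write_binary(std::ostream &output, std::vector<T> const &input) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw byte copy needs a trivially copyable element type");
    const std::uint64_t size = input.size();
    output.write(reinterpret_cast<char const *>(&size), sizeof size);
    if (size > 0)
        output.write(reinterpret_cast<char const *>(input.data()),
                     static_cast<std::streamsize>(size * sizeof(T)));
}

// Read the count, resize the vector, then read the elements in one call.
// Returns false if the stream ran out of data.
template<class T>
bool read_binary(std::istream &input, std::vector<T> &output) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw byte copy needs a trivially copyable element type");
    std::uint64_t size = 0;
    if (!input.read(reinterpret_cast<char *>(&size), sizeof size))
        return false;
    output.resize(static_cast<std::size_t>(size));
    if (size > 0)
        input.read(reinterpret_cast<char *>(output.data()),
                   static_cast<std::streamsize>(size * sizeof(T)));
    return static_cast<bool>(input);
}
```

With files, open the stream with std::ios::binary so that no newline translation touches the bytes. A count read from an untrusted file should be sanity-checked before the resize.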