Object serialization in C++

Posted 2019-03-03 14:45

Question:

I would like to serialize/deserialize some structured data in order to send it over the network via a char* buffer.

More precisely, suppose I have a message of type struct Message.

struct Message {
    Header header;
    Address address;
    size_t size; // size of data part
    char* data;
} message;

In C, I would use something such as:

  size = sizeof(Header) + sizeof(Address) + sizeof(size_t) + message.size;
  memcpy(buffer, (char *) message, size);

to serialize, and

Message m = (Message) buffer;

to deserialize.

What would be the "right" way to do it in C++? Is it better to define a class rather than a struct? Should I overload some operators? Are there alignment issues to consider?

EDIT: Thanks for pointing out the "char *" problem. The provided C version is incorrect: the data section pointed to by the data field should be copied separately.

Answer 1:

Actually there are many flavors:

You can let Boost do it for you: http://www.boost.org/doc/libs/1_52_0/libs/serialization/doc/tutorial.html

Overloading the stream operators << for serialization and >> for deserialization works well with file and string streams.
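A minimal sketch of the stream-operator approach, under stated assumptions: the question does not define Header or Address, so two plain integer fields (type and host, both hypothetical) stand in for them, and the space-separated text encoding is just one possible choice of wire format.

```cpp
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Simplified stand-in for the question's Message (field names are assumptions).
struct Message {
    std::uint16_t type = 0;  // stands in for Header
    std::uint32_t host = 0;  // stands in for Address
    std::string data;        // owns the payload instead of a raw char*
};

// Writes "type host size payload"; the payload is written raw, so it may contain spaces.
std::ostream& operator<<(std::ostream& os, const Message& m) {
    os << m.type << ' ' << m.host << ' ' << m.data.size() << ' ';
    os.write(m.data.data(), static_cast<std::streamsize>(m.data.size()));
    return os;
}

// Reads the same layout back; m is only modified if the whole message was read.
std::istream& operator>>(std::istream& is, Message& m) {
    Message tmp;
    std::size_t size = 0;
    if (!(is >> tmp.type >> tmp.host >> size)) return is;
    is.get();  // skip the single separator before the payload
    tmp.data.resize(size);
    is.read(&tmp.data[0], static_cast<std::streamsize>(size));
    if (is) m = tmp;
    return is;
}
```

Because the operators only depend on std::ostream and std::istream, the same code works with std::stringstream and std::fstream. A real protocol would also want an upper bound on size before calling resize.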

You could specify a constructor Message (const char*) for constructing from a char*.

I am a fan of static methods for deserialization like:

struct Message {
  ...
  static bool deserialize(Message& dest, char* source);
};

since you could catch errors directly when deserializing.
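As a rough sketch of what such a static method might look like: the Header and Address definitions below are placeholders (the question does not show them), the extra length parameter is an addition so that truncated input can be detected, and the wire layout (type, host, 32-bit size, payload, all in host byte order) is an assumption made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical fixed-size fields; the real Header and Address are not shown in the question.
struct Header  { std::uint16_t type; };
struct Address { std::uint32_t host; };

struct Message {
    Header header{};
    Address address{};
    std::vector<char> data;  // owns the payload

    // Returns false and leaves dest untouched if source is null or too short.
    static bool deserialize(Message& dest, const char* source, std::size_t length);
};

bool Message::deserialize(Message& dest, const char* source, std::size_t length) {
    // Assumed layout: 2-byte type, 4-byte host, 4-byte payload size, payload bytes.
    constexpr std::size_t kFixed = sizeof(std::uint16_t) + 2 * sizeof(std::uint32_t);
    if (source == nullptr || length < kFixed) return false;

    Message tmp;
    std::uint32_t size = 0;
    std::size_t off = 0;
    // memcpy avoids misaligned reads from the byte buffer.
    std::memcpy(&tmp.header.type, source + off, sizeof tmp.header.type);
    off += sizeof tmp.header.type;
    std::memcpy(&tmp.address.host, source + off, sizeof tmp.address.host);
    off += sizeof tmp.address.host;
    std::memcpy(&size, source + off, sizeof size);
    off += sizeof size;

    if (length - off < size) return false;  // payload is truncated
    tmp.data.assign(source + off, source + off + size);
    dest = std::move(tmp);
    return true;
}
```

The caller checks the return value right at the call site, which is the error-handling benefit mentioned above.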

The version you proposed is also OK, provided the modifications suggested in the comments are applied.
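For illustration, a corrected memcpy-based version could look roughly like the sketch below. It assumes Header and Address are trivially copyable (the definitions here are placeholders), and serialize is a hypothetical helper name. Each field is copied separately, and the bytes that data points to are copied rather than the pointer value.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Placeholder definitions: the question does not show Header or Address.
struct Header  { unsigned short type; };
struct Address { unsigned int host; };

struct Message {
    Header header;
    Address address;
    std::size_t size;  // size of data part
    char* data;
};

// Copies each fixed-size field, then the payload bytes.
// The pointer stored in m.data is never written to the buffer.
std::vector<char> serialize(const Message& m) {
    std::vector<char> buffer(sizeof(Header) + sizeof(Address) +
                             sizeof(std::size_t) + m.size);
    char* out = buffer.data();
    std::memcpy(out, &m.header, sizeof(Header));
    out += sizeof(Header);
    std::memcpy(out, &m.address, sizeof(Address));
    out += sizeof(Address);
    std::memcpy(out, &m.size, sizeof(std::size_t));
    out += sizeof(std::size_t);
    if (m.size != 0) std::memcpy(out, m.data, m.size);
    return buffer;
}
```

Using memcpy for each field sidesteps alignment problems when reading or writing at arbitrary buffer offsets. Byte order and the width of size_t are still copied as-is, so this layout is only safe between machines with the same ABI; a portable protocol would use fixed-width integers in a defined byte order.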



Answer 2:

Why not insert a virtual 'NetworkSerializable' class into your inheritance tree? A 'void NetSend(fd socket)' method would send stuff (without exposing any private data), and 'int(bufferClass buffer)' could return -1 if no complete, valid message was deserialized, or, if a valid message has been assembled, the number of unused chars in 'buffer'.

That encapsulates all the assembly/disassembly protocol state vars and other gunge inside the class, where it belongs. It also allows message/s to be assembled from multiple stream input buffers.
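A rough sketch of the receiving half of this idea follows. The method name consume, the TextMessage class, and the 4-byte length-prefix framing are all assumptions made for illustration; the sending method is left out to keep the example free of platform-specific socket calls.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

class NetworkSerializable {
public:
    virtual ~NetworkSerializable() = default;
    // Feeds len bytes of stream input. Returns -1 while the message is still
    // incomplete, otherwise the number of bytes of buf that were left unused.
    virtual long consume(const char* buf, std::size_t len) = 0;
};

// Hypothetical message type: 4-byte length prefix (host byte order), then text.
class TextMessage : public NetworkSerializable {
public:
    long consume(const char* buf, std::size_t len) override {
        std::size_t used = 0;
        // Phase 1: collect the length prefix, possibly across several calls.
        while (prefixBytes_ < sizeof prefix_ && used < len)
            prefix_[prefixBytes_++] = buf[used++];
        if (prefixBytes_ < sizeof prefix_) return -1;

        std::uint32_t want = 0;
        std::memcpy(&want, prefix_, sizeof want);

        // Phase 2: collect as much of the payload as this buffer provides.
        std::size_t need = want - body_.size();
        std::size_t take = std::min(need, len - used);
        body_.append(buf + used, take);
        used += take;
        if (body_.size() < want) return -1;
        return static_cast<long>(len - used);
    }

    const std::string& body() const { return body_; }

private:
    // Per-instance protocol state, so each connection can own its own parser.
    char prefix_[4] = {};
    std::size_t prefixBytes_ = 0;
    std::string body_;
};
```

Since the partial prefix and partial body live in the object, a message split across several reads is reassembled simply by calling consume once per received buffer.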

I'm not a fan of static methods. Protocol state data associated with deserialization should be per-instance, (thread-safety).