I have a "User" class with 40+ private variables, including complex objects like private/public keys (QCA library), custom QObjects, etc. The idea is that the class has a function called sign() which encrypts, signs, and serializes the object, and returns a QByteArray which can then be stored in a SQLite blob.
What's the best approach to serialize a complex object? Iterating through the properties with QMetaObject? Converting it to a protobuf object?
Could it be cast to a char array?
No, because you'd be casting QObject's internals that you know nothing about, pointers that are not valid the second time you run your program, etc.
TL;DR: Implementing it manually is OK for explicit data elements, and leveraging the metaobject system for QObject and Q_GADGET classes will help with some of the drudgery.
The simplest solution might be to implement QDataStream operators for the object and the types you use. Make sure to follow good practice: each class that could conceivably ever change the format of the data it holds must emit a format identifier.
For example, let's take the following classes:
The Q_DECLARE_METATYPE macro makes the classes known to the QVariant and the QMetaType type system. Thus, for example, it's possible to assign an Address to a QVariant, convert such a QVariant to Address, stream the variant directly to a datastream, etc.
First, let's address how to dump the QObject properties:
In general, if we were to deal with data from a User that didn't have the m_props member, we'd need to be able to clear the properties. This idiom will come up every time you extend the stored object and upgrade the serialization format.
Now we know how to restore the properties from a stream:
We can thus implement the stream operators to serialize our objects:
The property system will also work for any other class, as long as you declare its properties and add the Q_GADGET macro (instead of Q_OBJECT). This is supported from Qt 5.5 onwards.
Suppose that we declared our Address class as follows:
Let's then declare the datastream operators in terms of [dump|clear|load]Properties modified for dealing with gadgets:
We do not need to change the format designator even if the property set has been changed. We should retain the format designator in case we had other changes that couldn't be expressed as a simple property dump anymore. This is unlikely in most cases, but one must remember that a decision not to use a format specifier immediately sets the format of the streamed data in stone. It's not subsequently possible to change it!
Finally, the property handlers are slightly cut-down and modified variants of the ones used for the QObject properties:
TODO: An issue that was not addressed in the loadProperties implementations is clearing the properties that are present in the object but not present in the serialization.
It is very important to establish how the entire data stream is versioned when it comes to the internal version of QDataStream formats. The documentation is required reading.
One also has to decide how compatibility is handled between versions of the software. There are several approaches:
1. (Most typical and unfortunate) No compatibility: No format information is stored. New members are added to the serialization in an ad-hoc fashion. Older versions of the software will exhibit undefined behavior when faced with newer data. Newer versions will do the same with older data.
2. Backward compatibility: Format information is stored in the serialization of each custom type. New versions can properly deal with older versions of the data. Older versions must detect an unhandled format, abort deserialization, and indicate an error to the user. Ignoring newer formats leads to undefined behavior.
3. Full backward-and-forward compatibility: Each serialized custom type is stored in a QByteArray or a similar container. By doing this, you have information on how long the data record for the entire type is. The QDataStream version must be fixed. To read a custom type, its byte array is read first, then a QBuffer is set up over it, and a QDataStream reads from that buffer. You read the elements you can parse in the formats you know of, and ignore the rest of the data. This forces an incremental approach to formats, where a newer format can only append elements to an existing format. If a newer format abandons some data element from an older format, it must still dump it, but with a null or otherwise safe default value that keeps the older versions of your code "happy".
If you think that the format bytes may ever run out, you can employ a variable-length encoding scheme, known as extension or extended octets, familiar across various ITU standards (e.g. Q.931 4.5.5 Bearer Capability information element). The idea is as follows: the highest bit of an octet (byte) is used to indicate whether the value needs more octets for representation. This leaves the byte with 7 bits to represent the value and 1 bit to mark extension. If the bit is set, you read the subsequent octets and concatenate them in little-endian fashion to the existing value. Here is how you might do this:
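A minimal sketch of the extension-octet scheme, written against the standard library only so that it runs without Qt. The function names encodeVarLengthInt and decodeVarLengthInt are hypothetical, and the byte vector is an assumed stand-in for whatever stream the value would really be written to.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: append `value` to `out` as extension octets.
// Each octet carries 7 value bits, least significant group first;
// the high bit is set when more octets follow.
void encodeVarLengthInt(std::uint32_t value, std::vector<std::uint8_t> &out) {
    while (value > 0x7F) {
        out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
}

// Hypothetical helper: decode one value starting at `pos` and advance `pos`.
// Returns false on truncated input or on a value that does not fit in 32 bits.
bool decodeVarLengthInt(const std::vector<std::uint8_t> &in, std::size_t &pos,
                        std::uint32_t &value) {
    assert(pos <= in.size());
    value = 0;
    for (int shift = 0; shift < 32; shift += 7) {
        if (pos >= in.size()) return false;            // ran out of input
        const std::uint8_t octet = in[pos++];
        if (shift == 28 && (octet & 0x70)) return false; // would overflow 32 bits
        value |= static_cast<std::uint32_t>(octet & 0x7F) << shift;
        if ((octet & 0x80) == 0) return true;           // extension bit clear: done
    }
    return false;                                       // more than five octets
}
```

The decoder rejects overlong input rather than silently truncating it, which is one possible design choice; a reader that must tolerate wider values would need a wider integer type.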
The serialization of VarLengthInt has variable length and always uses the minimum number of bytes possible for a given value: 1 byte up to 0x7F, 2 bytes up to 0x3FFF, 3 bytes up to 0x1F'FFFF, 4 bytes up to 0x0FFF'FFFF, etc. Apostrophes are valid in C++14 integer literals.
It would be used as follows:
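As a rough illustration, assuming a plain Address struct and a byte vector in place of the Qt types, a variable-length format identifier could lead each record, with the reader accepting older formats and rejecting newer ones (the backward-compatibility approach). All names here (Bytes, putVarInt, getVarInt, putString, getString, serialize, deserialize, kAddressFormat) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Extension-octet helpers: 7 value bits per octet, high bit means "more follows".
void putVarInt(Bytes &out, std::uint32_t v) {
    for (; v > 0x7F; v >>= 7) out.push_back(static_cast<std::uint8_t>(v | 0x80));
    out.push_back(static_cast<std::uint8_t>(v));
}

bool getVarInt(const Bytes &in, std::size_t &pos, std::uint32_t &v) {
    v = 0;
    for (int shift = 0; shift < 32 && pos < in.size(); shift += 7) {
        const std::uint8_t octet = in[pos++];
        v |= static_cast<std::uint32_t>(octet & 0x7F) << shift;
        if (!(octet & 0x80)) return true;
    }
    return false; // truncated or overlong
}

// Strings are stored as a varint length followed by the raw bytes.
void putString(Bytes &out, const std::string &s) {
    putVarInt(out, static_cast<std::uint32_t>(s.size()));
    out.insert(out.end(), s.begin(), s.end());
}

bool getString(const Bytes &in, std::size_t &pos, std::string &s) {
    assert(pos <= in.size());
    std::uint32_t n = 0;
    if (!getVarInt(in, pos, n) || n > in.size() - pos) return false;
    s.assign(reinterpret_cast<const char *>(in.data()) + pos, n);
    pos += n;
    return true;
}

struct Address {          // assumed example payload
    std::string street;   // present since format 0
    std::string city;     // added in format 1
};

constexpr std::uint32_t kAddressFormat = 1; // newest format this code understands

Bytes serialize(const Address &a) {
    Bytes out;
    putVarInt(out, kAddressFormat); // the format identifier always comes first
    putString(out, a.street);
    putString(out, a.city);
    return out;
}

// Reads formats 0 and 1; refuses anything newer instead of guessing.
bool deserialize(const Bytes &in, Address &a) {
    std::size_t pos = 0;
    std::uint32_t format = 0;
    if (!getVarInt(in, pos, format) || format > kAddressFormat) return false;
    if (!getString(in, pos, a.street)) return false;
    a.city.clear(); // fields absent from older formats get a safe default
    if (format >= 1 && !getString(in, pos, a.city)) return false;
    return true;
}
```

With Qt, the same shape would presumably live in the QDataStream operators of each type, with the stream status used to report an unhandled format.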
Binary dump serialization is a bad idea: it will include a lot of stuff you don't need, like the object's v-table pointer, as well as other pointers, held directly or by other class members, which make no sense to serialize since they do not persist between application sessions.
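To make the point concrete, here is a small standard-library sketch (no Qt involved) in which a hypothetical rawDump helper refuses, at compile time, any type that is not trivially copyable. PlainRecord and User are assumed example types. Note that the check is necessary but not sufficient: a trivially copyable struct holding a raw pointer would pass it and still dump a meaningless address.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical helper: copy the object representation into a byte vector.
// The static_assert rejects types with v-tables or owning members.
template <typename T>
std::vector<unsigned char> rawDump(const T &value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw byte dumps require a trivially copyable type");
    std::vector<unsigned char> bytes(sizeof(T));
    std::memcpy(bytes.data(), &value, sizeof(T));
    return bytes;
}

struct PlainRecord {  // plain data only: a byte dump is at least well-defined
    int id;
    double balance;
};

struct User {         // assumed stand-in for a QObject-like class
    virtual ~User() = default; // introduces a v-table pointer
    std::string name;          // may own a heap pointer, invalid in the next run
    int age = 0;
};

// rawDump(User{}) would fail to compile because of the static_assert above.
static_assert(!std::is_trivially_copyable<User>::value,
              "User must be serialized member by member");
```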
If it is just a single class, just implement it by hand; it certainly won't kill you. If you have a family of classes, and they are QObject derived, you could use the meta system, but that will only register properties, whereas an int something member which is not tied to a property will be skipped. If you have a lot of data members which are not Qt properties, it will take you more typing to expose them as Qt properties (unnecessarily, I might add) than it would take to write the serialization method by hand.