What are the trade-offs, advantages and disadvantages of each of these implementations? Are they any different at all? What I want to achieve is to store a vector of boxes in a protobuf.
Impl 1:
package foo;
message Boxes
{
    message Box
    {
        required int32 w = 1;
        required int32 h = 2;
    }
    repeated Box boxes = 1;
}
Impl 2:
package foo;
message Box
{
    required int32 w = 1;
    required int32 h = 2;
}
message Boxes
{
    repeated Box boxes = 1;
}
Impl 3: Stream multiple of these messages into the same file.
package foo;
message Box
{
    required int32 w = 1;
    required int32 h = 2;
}
1 & 2 only change where / how the types are declared. The serialized bytes and the work itself will be identical.
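To see why the declaration site makes no difference, it can help to look at the bytes. The wire format carries only field numbers and wire types, never message names, so a nested Box and a top-level Box serialize the same way. The following is a minimal hand-rolled sketch, not the protobuf library: the helper names (encode_varint, tag, encode_box, encode_boxes) are hypothetical, and it assumes non-negative values only.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def tag(field_number: int, wire_type: int) -> bytes:
    """Field header: (field_number << 3) | wire_type, written as a varint."""
    return encode_varint((field_number << 3) | wire_type)


def encode_box(w: int, h: int) -> bytes:
    # w is field 1 and h is field 2; both use wire type 0 (varint).
    # Assumption for this sketch: w and h are non-negative.
    return tag(1, 0) + encode_varint(w) + tag(2, 0) + encode_varint(h)


def encode_boxes(boxes) -> bytes:
    # 'repeated Box boxes = 1': each element is written as field 1 with
    # wire type 2 (length-delimited), then a varint length, then the Box.
    out = bytearray()
    for w, h in boxes:
        payload = encode_box(w, h)
        out += tag(1, 2) + encode_varint(len(payload)) + payload
    return bytes(out)


encoded = encode_boxes([(3, 4), (300, 5)])
print(encoded.hex(" "))  # 0a 04 08 03 10 04 0a 05 08 ac 02 10 05
```

Nothing in those bytes says whether Box was declared inside Boxes or next to it; only the generated type names differ between Impl 1 and Impl 2.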
3 is more interesting: you can't just stream Box after Box after Box, because the root object in protobuf is not terminated (to allow concat === merge). If you only write Box messages back to back, when you deserialize you will have exactly one Box with the last w and h that were written. You need to add a length-prefix; you could do that arbitrarily, but: if you happen to choose to "varint"-encode the length, you're close to what the repeated field gives you - except the repeated field also includes a field-header (field 1, type 2 - so binary 1010 = decimal 10) before each "varint" length.
If I were you, I'd just use the repeated field for simplicity. Which of 1 / 2 you choose would depend on personal choice.
Marc Gravell's answer is certainly correct, but one point he missed is:
Most of the time it will not matter whether you use a repeated field or multiple messages, but if there are millions / billions of boxes, memory will be an issue for options 1 and 2 (repeated), since a Boxes message is normally parsed into memory as a whole, and option 3 (multiple messages in the file) would be the best choice.
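Both points above can be sketched in a few lines: unframed Box messages merge into one, while a varint length prefix lets a reader pull one Box at a time without holding the whole file in memory. This is a hand-rolled illustration under stated assumptions, not the protobuf library's API: all function names are hypothetical, values are assumed non-negative, and the parser only understands the two varint fields of Box.

```python
import io


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)
        else:
            out.append(low7)
            return bytes(out)


def read_varint(stream):
    """Read one varint from a binary stream; return None at a clean EOF."""
    result, shift = 0, 0
    while True:
        byte = stream.read(1)
        if not byte:
            if shift == 0:
                return None
            raise EOFError("stream ended inside a varint")
        result |= (byte[0] & 0x7F) << shift
        if not byte[0] & 0x80:
            return result
        shift += 7


def encode_box(w: int, h: int) -> bytes:
    # 0x08 = field 1, varint; 0x10 = field 2, varint.
    return b"\x08" + encode_varint(w) + b"\x10" + encode_varint(h)


def decode_box(data: bytes) -> dict:
    """Parse fields w and h; a later occurrence overwrites an earlier one."""
    names = {1: "w", 2: "h"}
    stream = io.BytesIO(data)
    box = {}
    while True:
        key = read_varint(stream)
        if key is None:
            return box
        field_number, wire_type = key >> 3, key & 7
        if wire_type != 0 or field_number not in names:
            raise ValueError("this sketch only handles varint fields 1 and 2")
        box[names[field_number]] = read_varint(stream)


def write_boxes(stream, boxes):
    """Write each Box preceded by its varint-encoded length."""
    for w, h in boxes:
        payload = encode_box(w, h)
        stream.write(encode_varint(len(payload)))
        stream.write(payload)


def read_boxes(stream):
    """Yield one Box at a time, so memory use does not grow with the file."""
    while True:
        length = read_varint(stream)
        if length is None:
            return
        payload = stream.read(length)
        if len(payload) != length:
            raise EOFError("stream ended inside a message")
        yield decode_box(payload)


boxes = [(3, 4), (300, 5), (7, 8)]

# Without framing, the concatenated messages parse as a single Box.
unframed = b"".join(encode_box(w, h) for w, h in boxes)
print(decode_box(unframed))  # {'w': 7, 'h': 8}

# With a length prefix, every Box comes back individually.
buffer = io.BytesIO()
write_boxes(buffer, boxes)
buffer.seek(0)
for box in read_boxes(buffer):
    print(box)
```

Because read_boxes is a generator, only one Box is decoded at a time; with a real file in place of the in-memory buffer, the same loop would work for very large inputs.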
So in summary:
Personally, I would like to see a "standard" multiple-message format.