I have a base class message:
message Animal {
  optional string name = 1;
  optional int32 age = 2;
}
and a subclass that extends Animal:
message Dog {
  optional string breed = 1;
}
So while building a Dog message, I should be able to set all the fields of Animal. I know the roundabout way of doing it (declaring all the Animal fields again in the Dog message), but is there a simple and effective way to do this with Protocol Buffers?
I also learned about extensions. As I understand it, they are only used to add new fields to an already existing message, so they should not be mistaken for a way to achieve inheritance.
Is it possible to achieve the above design using Protocol Buffers extensions?
There are a few different ways to accomplish "inheritance" in Protocol Buffers. Which one you want depends on your use case.
Option 1: Subclass contains superclass
message Animal {
  optional string name = 1;
  optional int32 age = 2;
}
message Dog {
  required Animal animal = 1;
  optional string breed = 2;
}
Here, Dog contains an Animal, and thus contains all the information of Animal.
This approach works if you do not need to support down-casting. That is, you never have to ask "Is this Animal a Dog?" So, anything which might need to access the fields of Dog needs to take a Dog as its input, not an Animal. For many use cases, this is fine.
Option 2: Superclass contains all subclasses
message Animal {
  optional string name = 1;
  optional int32 age = 2;
  // Exactly one of these should be filled in, depending on the species.
  optional Dog dog = 100;
  optional Cat cat = 101;
  optional Axolotl axolotl = 102;
  // ...
}
In this approach, given an Animal, you can figure out which animal it is and access the species-specific information. That is, you can down-cast.
This works well if you have a fixed list of "subclasses". Just list all of them, and document that only one of the fields should be filled in. If there are a lot of subclasses, you might want to add an enum field to indicate which one is present, so that you don't have to individually check has_dog(), has_cat(), has_mouse(), ...
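As a rough sketch of the enum idea above: the Species enum, the species field, its field number 3, and the Cat message with its indoor field are all made up for illustration and are not part of the original schema. The second message shows an alternative that newer protobuf releases (2.6 and later) offer, a oneof, which lets the library itself enforce that at most one member is set. Treat both as starting points rather than the one right layout.

```proto
syntax = "proto2";

message Dog {
  optional string breed = 1;
}

// Hypothetical second species, only here to make the sketch complete.
message Cat {
  optional bool indoor = 1;
}

// Variant A: an explicit enum tag that says which species field is filled in.
message Animal {
  optional string name = 1;
  optional int32 age = 2;

  // Hypothetical discriminator; writers must keep it in sync with the
  // species field they actually set. Nothing enforces this automatically.
  enum Species {
    UNKNOWN = 0;
    DOG = 1;
    CAT = 2;
  }
  optional Species species = 3;

  optional Dog dog = 100;
  optional Cat cat = 101;
}

// Variant B: a oneof group. Setting one member clears the others, and the
// generated code can report which member (if any) is currently set.
// Fields inside a oneof take no optional/required/repeated label.
message AnimalWithOneof {
  optional string name = 1;
  optional int32 age = 2;

  oneof species {
    Dog dog = 100;
    Cat cat = 101;
  }
}
```

With variant A, a reader can switch on species instead of probing each has_...() accessor in turn. Variant B trades the hand-maintained tag for one the library maintains, but a oneof cannot contain extensions, so it only applies to the fixed-list case described here.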
Option 3: Extensions
message Animal {
  optional string name = 1;
  optional int32 age = 2;
  extensions 100 to max;  // Should contain exactly one "species" extension.
}
message Dog {
  optional string breed = 1;
}
extend Animal {
  optional Dog animal_dog = 100;
  // (The number must be unique among all Animal extensions.)
}
This option is actually semantically identical to option #2! The only difference is that instead of declaring lots of optional fields inside Animal, you are declaring them as extensions. Each extension effectively adds a new field to Animal, but you can declare them in other files, so you don't have to have one central list, and other people can add new extensions without editing your code. Since each extension behaves just like a regular field, other than having somewhat-weird syntax for declaring and accessing it, everything behaves the same as with option #2. (In fact, in the example here, the wire encoding would even be the same, since we used 100 as the extension number, and in option 2 we used 100 as the field number.)
This is the trick to understanding extensions. Lots of people get confused because they try to equate "extend" to inheritance in object-oriented languages. Don't do that! Just remember that extensions behave just like fields, and that options 2 and 3 here are effectively the same. It's not inheritance... but it can solve the same problems.
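To make the "declare them in other files" point concrete, here is a minimal sketch of a second file adding its own species. The file names animal.proto and cat.proto, the Cat message, its indoor field, and the animal_cat extension are assumptions for illustration; only Animal, Dog, and animal_dog come from the example above.

```proto
// cat.proto: a hypothetical file maintained separately from animal.proto.
syntax = "proto2";

// Assumed to define Animal with "extensions 100 to max;" as in option 3.
import "animal.proto";

// Hypothetical species message.
message Cat {
  optional bool indoor = 1;
}

// Adds a new field to Animal without editing animal.proto.
extend Animal {
  // 101 is chosen so it does not collide with animal_dog = 100.
  optional Cat animal_cat = 101;
}
```

Nothing in animal.proto has to change for this to work, which is the practical difference from option 2. The cost is that extension numbers must be coordinated somehow among everyone who extends Animal, since two extensions sharing a number would conflict.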