Accessing field of Protobuf message of unknown type

Posted 2020-03-12 03:23

Let's say I have 2 Protobuf-Messages, A and B. Their overall structure is similar, but not identical. So we moved the shared stuff out into a separate message we called Common. This works beautifully.

However, I'm now facing the following problem: A special case exists where I have to process a serialized message, but I don't know whether it's a message of type A or type B. I have a working solution in C++ (shown below), but I failed to find a way to do the same thing in Python.

Example:

// file: Common.proto
// contains some kind of shared struct that is used by all messages:
message Common {
 ...
}

// file: A.proto
import "Common.proto";

message A {
   required int32  FormatVersion             = 1;
   optional bool   SomeFlag [default = true] = 2;
   optional Common CommonSettings            = 3;

   ... A-specific Fields ...
}

// file: B.proto
import "Common.proto";

message B {
   required int32  FormatVersion             = 1;
   optional bool   SomeFlag [default = true] = 2;
   optional Common CommonSettings            = 3;

   ... B-specific Fields ...
}

Working Solution in C++

In C++ I'm using the reflection API to get access to the CommonSettings field like this:

namespace gp = google::protobuf;
...
Common* getCommonBlock(gp::Message* paMessage)
{
   // Look up the field by its number, so this works for any message type that keeps Common at field 3.
   const gp::FieldDescriptor* paFieldDescriptor = paMessage->GetDescriptor()->FindFieldByNumber(3);
   const gp::Reflection* paReflection = paMessage->GetReflection();
   // MutableMessage() returns a Message*; dynamic_cast yields nullptr if it is not a Common.
   return dynamic_cast<Common*>(paReflection->MutableMessage(paMessage, paFieldDescriptor));
}

The method 'getCommonBlock' uses FindFieldByNumber() to get hold of the descriptor of the field I'm trying to get. Then it uses reflection to fetch the actual data. getCommonBlock can process messages of type A, B or any future type as long as the Common field keeps field number 3.

My question is: Is there a way to do a similar thing in Python? I've been looking at the Protobuf documentation, but couldn't figure out a way to do it.

4 Answers
我命由我不由天
#2 · 2020-03-12 03:27

One of the advantages of Python over a statically-typed language like C++ is that you don't need to use any special reflection code to get an attribute of an object of unknown type: you just ask the object. The built-in function that does this is getattr, so you can do:

settings_value = getattr(obj, 'CommonSettings')
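
If you only know the field number, as in the C++ version, the message's descriptor can translate it into a name before calling getattr. This is a rough sketch rather than a definitive implementation: it assumes the message is an instance of a protoc-generated class (which exposes DESCRIPTOR.fields_by_number), and the helper name get_common_block and its field_number parameter are made up for illustration.

```python
def get_common_block(message, field_number=3):
    """Return the value stored at field_number, or None if the type has no such field."""
    # DESCRIPTOR.fields_by_number maps field numbers to FieldDescriptor objects.
    field = message.DESCRIPTOR.fields_by_number.get(field_number)
    if field is None:
        return None
    # FieldDescriptor.name is the attribute name on the generated class.
    return getattr(message, field.name)
```

Calling get_common_block(a_message) or get_common_block(b_message) should then return the CommonSettings sub-message for either type, as long as both keep it at field number 3.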
家丑人穷心不美
#3 · 2020-03-12 03:38

I had a similar problem.

What I did was to create a new message, with an enum specifying the type:

enum TYPE {
  A = 0;
  B = 1;
}
message Base {
  required TYPE type = 1;
  ... Other common fields ...
}

Then create specific message types:

message A {
  required TYPE type = 1 [default = A];
  ... other A fields ...
}

And:

message B {
  required TYPE type = 1 [default = B];
  ... other B fields ...
}

Be sure to define the 'Base' message carefully up front, or you will lose binary compatibility if you add fields to it later (you would have to shift the field numbers of the inheriting messages too).

That way, you can receive a generic message:

msg = ... receive message from net ...

# detect message type by parsing only the shared fields
base = Base()
base.ParseFromString(msg)

# check for type
if base.type == TYPE.A:
    # parse message as appropriate type
    packet = A()
else:
    # this is a B message... or whatever
    packet = B()
packet.ParseFromString(msg)

# ... continue with your business logic ...

Hope this helps.

手持菜刀,她持情操
#4 · 2020-03-12 03:50

How about "concatenating" two protocol buffers in a header+payload format, e.g. a header holding the common data, followed by either message A or B, as suggested by protobuf techniques?

This is how I handled various payload types sent as a blob inside an MQTT message.
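
To make the header+payload idea concrete, here is a minimal sketch of one possible framing. The 4-byte big-endian length prefix and the names pack_frame and unpack_frame are my own assumptions, not something the answer above specifies; header_bytes and payload_bytes stand for whatever SerializeToString() returned for the common message and for A or B.

```python
import struct

def pack_frame(header_bytes, payload_bytes):
    """Prefix the header with its length so the reader can split the two parts."""
    return struct.pack(">I", len(header_bytes)) + header_bytes + payload_bytes

def unpack_frame(frame):
    """Split a frame produced by pack_frame into (header_bytes, payload_bytes)."""
    (header_len,) = struct.unpack_from(">I", frame, 0)
    header_end = 4 + header_len
    if header_end > len(frame):
        raise ValueError("frame is shorter than its declared header length")
    return frame[4:header_end], frame[header_end:]
```

The receiver would parse the header first and use something in it (for example a type field) to decide whether the payload should be parsed as A or B.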

混吃等死
#5 · 2020-03-12 03:53

I know this is an old thread, but I'll respond anyway for posterity:

Firstly, as you know, it's not possible to determine the type of a protocol buffer message purely from its serialized form. The only information in the serialized form you have access to is the field numbers, and their serialized values.
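
To illustrate what "field numbers and their serialized values" means in practice, here is a small sketch that walks the top level of a serialized message without any generated classes. It is only an approximation of the wire format: read_varint and iter_fields are names I made up, and the deprecated group wire types (3 and 4) are not handled.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7

def iter_fields(buf):
    """Yield (field_number, wire_type, value) for each top-level field."""
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:      # varint (int32, bool, enum, ...)
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:    # 64-bit fixed
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:    # length-delimited (strings, bytes, sub-messages)
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:    # 32-bit fixed
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        yield field_number, wire_type, value
```

Nothing in that stream names the message type: a serialized A and a serialized B both just show a varint at field 1, possibly one at field 2, and a length-delimited blob at field 3.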

Secondly, the "right" way to do this would be to have a proto that contains both, like

message Parent {
   required int32  FormatVersion             = 1;
   optional bool   SomeFlag [default = true] = 2;
   optional Common CommonSettings            = 3;

   oneof letters_of_alphabet {
      A a_specific = 4;
      B b_specific = 5;
   }
}

This way, there's no ambiguity: you just parse the same proto (Parent) every time.
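
On the Python side, reading such a Parent could look roughly like the sketch below. It assumes parent is an already-parsed instance of the generated Parent class; WhichOneof is part of the generated message API, while the helper name specific_part is just for illustration.

```python
def specific_part(parent):
    """Return (field_name, sub_message) for the populated oneof member, or (None, None)."""
    # WhichOneof returns the name of the field that is set, or None if none is.
    which = parent.WhichOneof("letters_of_alphabet")
    if which is None:
        return None, None
    return which, getattr(parent, which)
```

The shared fields (FormatVersion, SomeFlag, CommonSettings) are read directly from parent, regardless of which member of the oneof is set.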


Anyway, if it's too late to change that, what I recommend you do is define a new message with only the shared fields, like

message Shared {
   required int32  FormatVersion             = 1;
   optional bool   SomeFlag [default = true] = 2;
   optional Common CommonSettings            = 3;
}

You should then be able to pretend that the message (either A or B) is in fact a Shared, and parse it accordingly. The unknown fields will be irrelevant.
