how do has_field() methods relate to default value

2019-01-24 00:44发布

问题:

I'm trying to determine the relationship between default values and the has_foo() methods that are declared in various programmatic interfaces. In particular, I'm trying to determine under what circumstances (if any) you can "tell the difference" between a field explicitly set to the default value, and an unset value.

  1. If I explicitly set a field (e.g. "Bar.foo") to its default value (e.g., zero), then is Bar::has_foo() guaranteed return true for that data structure? (This appears to be true for the C++ generated code, from a quick inspection, but that doesn't mean it's guaranteed.) If this is true, then it's possible to distinguish between an explicitly set default value and an unset prior to serialization.

  2. If I explicitly set a field to its default value (e.g., zero), and then serialize that object and send it over the wire, will the value be sent or not? If it is not, then clearly any code that receives this object can't distinguish between an explicitly set default value and an unset value. I.e., it won't be possible to distinguish these two cases after serialization -- Bar::has_foo() will return false in both cases.

If it's not possible to tell the difference, what is the recommended technique for encoding a protobuf field if I want to encode a "nullable" optional value? A couple options come to mind, but neither seem great: (a) add an extra boolean field that records whether the field is set or not, or (b) use a "repeated" field even though I semantically want an optional field -- this way I can tell the difference between no value (length-zero list) or a set value (length-one list).

回答1:

The following applies for 'proto2' syntax, not 'proto3' :

The notion of a field being set or not is a core feature of Protobuf. If you set a field to a value (any value), then the corresponding has_xxx method must return true, otherwise you have a bug in the API.

If you do not set a field and then serialize the message, no value is sent for that field. The receiving side will parse the message, discover which values where included, and set the corresponding "has_xxx" values.

Exactly how this is implemented in the wire-format is documented here: http://code.google.com/apis/protocolbuffers/docs/encoding.html. The short version is that message are encoded as a sequence of key-value pairs, and only fields which are explicitly set are included in the encoded message.

Default values only come into play when you attempt to read an unset field.