How to validate that a Protobuf message does not contain enum fields with zero value? Turns out that this isn’t supported directly by Protobuf! We need to look into how the protojson package is implemented.
More and more companies are adopting gRPC with Protobuf for communication between internal services. It has the benefits of high performance, support for multiple programming languages, and being backed by Google with a great ecosystem around it.
For communication with front-end and external services, Protobuf can be marshaled to JSON format. The browser only understands JSON format, and we cannot expect other companies to consume Protobuf directly from us. (Of course, you can, if you are big enough!)
Sample code is written in Go.
From the Protobuf style guide, the zero value enum should have the suffix UNSPECIFIED. It’s because an enum is implemented as an int32 and the value 0 is considered as, well, unspecified. It’s similar to nil for a message or an empty string. When encoding Protobuf as JSON, a nil message, an UNSPECIFIED enum, or an empty string is omitted by default.
We were following that convention until one day, we did not.
When sending external webhook messages, we decided not to use UNSPECIFIED as the zero value. One reason is that we are using EmitUnpopulated: true to ensure that all fields are included in the JSON representation when sending webhook messages to external parties. And we don’t want that UNSPECIFIED value to appear in the webhook messages if we somehow forget to set an enum field and it stays at 0. Unit tests cannot catch all the mistakes; we engineers know that.
This caused a lot of trouble, so we had to revert and make the zero value UNSPECIFIED again. One problem is that it forces the use of EmitUnpopulated: true everywhere! And there are places where we don’t want to emit all unpopulated fields, like calling some third-party APIs. Some messages mix UNSPECIFIED enums and non-UNSPECIFIED enums; there is no way to send the correct format with that. Use EmitUnpopulated: true, and the third-party APIs don’t understand it. Use EmitUnpopulated: false, and some required fields with non-UNSPECIFIED enums are omitted. Of course, they can all be refactored away, but it should be simpler to just force the use of UNSPECIFIED from the beginning.
Turns out there is no simple way to do that in Protobuf 3!
In Protobuf 2, there is a required option to prevent a field from being unset. This option was removed in Protobuf 3 because it gets in the way of refactoring when removing fields. If we forget to update every service to remove that no-longer-used required field, especially in a company with multiple teams working together, the messages will be dropped unintentionally. It is better not to require it upfront.
In Protobuf 3, there was the jsonpb.JSONPBMarshaler interface in Go’s older jsonpb package. We could simply implement that interface for all enums to return an error upon seeing a zero value. But again, it was removed! As a protocol, we should minimize the customization as much as possible. Otherwise, that customization will have to be implemented and maintained in all different languages across different teams!
We’ll have to reach for the reflection package. The protoreflect.Message interface has a Range() method for iterating over every populated field. We can use that method to verify that there are no enum fields with zero… Oh, wait. It only iterates over populated fields. So it won’t detect the zero value in an enum!
But the function protojson.Marshal() can still emit unpopulated fields with the EmitUnpopulated option. How does it implement that? Diving into encoding/protojson, there is a code snippet for iterating over unpopulated fields (source). Let’s steal it:
What the above code does is iterate over additional fields by looping over protoreflect.Message.Descriptor().Fields(). Fields within oneof fields are skipped. Unpopulated singular message fields are set as invalid (think of it as null in generated JSON) before being sent to the input function.
Still, there is a bit more code to write, like implementing a traversal method for iterating over all the different Protobuf types: message, array (repeated), dynamic Struct, and of course, enum. But it’s solvable. And I can take a rest now.
Thanks for reading!