Why oneof is not wrapped in option? #227
Comments
I haven't looked much at the proto2 side of things (to me it's just
deprecated).
The proto3 guide doesn't have a notion of optional, so my understanding
is that the first variant's [default
value](https://protobuf.dev/programming-guides/proto3/#default) would be
used. See https://protobuf.dev/programming-guides/proto3/#oneof .
I've also seen proto3 files that explicitly keep a dummy variant as the
first case of a `oneof` for this reason.
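To make the two shapes concrete, here is a minimal hand-written sketch. It is not actual ocaml-protoc output: the `event` message, its `payload` oneof and the `Unset` dummy case are hypothetical names chosen for illustration.

```ocaml
(* Hypothetical message (assumption, not from a real .proto file):
     message Event { oneof payload { string text = 1; int32 code = 2; } } *)

(* Shape A: the oneof is embedded directly. A decoder that saw no oneof field
   still has to produce a value, here the first constructor with its default. *)
type payload =
  | Text of string
  | Code of int32

type event = { payload : payload }

let default_event : event = { payload = Text "" }

(* Shape B: the .proto author keeps a dummy first case, so the default value
   explicitly means "nothing was set". *)
type payload_with_dummy =
  | Unset
  | Text_d of string
  | Code_d of int32

type event_with_dummy = { payload_d : payload_with_dummy }

let default_event_with_dummy : event_with_dummy = { payload_d = Unset }

(* With shape A, "not sent" and "sent as the empty string" are the same value. *)
let looks_unset (e : event) : bool = e.payload = Text ""
```

With shape A, `looks_unset` cannot tell an absent oneof from one explicitly set to `Text ""`, which is the ambiguity the dummy variant works around.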
|
Sadly proto2 is still around in a lot of places and is unlikely to disappear. All custom options are defined in proto2 as extensions to […]
I'm trying an ugly hack of exporting […]
I'm reading the encoding section of the docs, which seems to apply to both proto2 and proto3, and as I cited above, it says that oneof does not have any special rules for encoding on the wire.
The proto3 docs link that you shared stresses […]
The Golang language guide on oneof officially encodes non-set oneofs as nils and suggests checking for that to figure out whether it was set or not. The Java language guide on oneof adds a special value […]
If the wire encoding of a oneof is exactly the same as that of the individual fields inside, and it's up to the higher level to decide on the API, and both Golang and Java offer an official way to learn that a oneof was not specified on the wire, why can't we do this in an idiomatic way in the OCaml implementation by wrapping oneofs in options?
|
Both Go and Java have implicit nullability, which OCaml doesn't have. But I hear your point. Looking at opentelemetry's proto3 files, I see at least one […]
Another important quirk of ocaml-protoc is that, with proto3, you don't know which fields have been set. I'm not sure how it's done in, say, Go, but it is related.
I can imagine us having a CLI flag that makes optionality more explicit (adding options to fields, or adding a 0th-case […]
|
Okay, let's look at a protobuf implementation for a decent language which lacks implicit nullability :) Will Rust work for this purpose? Quote from the prost documentation on oneof fields: "oneof fields are always wrapped in an Option."
Java adds an additional enum variant to indicate that a oneof was not actually set on the wire; I believe this has nothing to do with implicit nullability. For Go, we have official Google documentation that explicitly states how implicit nullability is used to encode missing oneof fields. And for the Rust implementation we have a clear "always wrapped in an Option".
As far as I know, messages are encoded as a list of key-value pairs on the wire. Oneof fields do not change the wire encoding, so members of a oneof are encoded as ordinary fields within the enclosing message. The only difference is that the decoder needs to apply "last write wins" semantics to figure out which oneof member is actually set: they share the same memory in the target language representation (e.g. encoded as a union in C), so we have to know exactly which of the oneof members was actually set.
Defaults are typically not sent on the wire with protobuf, right, but proto 3 added […]
In generated Golang protobuf code for oneofs, there is a clear way to specify which oneof member is set and what its value is, even for primitive types like integers: the protobuf generator wraps oneof members in structs that implement interfaces, so that implicit nullability can be leveraged to mimic algebraic data type semantics (know which case is set, and have only one such case set at a time).
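The "last write wins" rule can be sketched in OCaml as a fold over the decoded fields. This is a simplified illustration, not the ocaml-protoc decoder: the `wire_value` type, the `event` message and its field numbers are assumptions, and the wire format is reduced to a list of already-parsed (field number, value) pairs.

```ocaml
(* Hypothetical message (assumption):
     message Event { oneof payload { string text = 1; int32 code = 2; } }
   The wire is simplified to already-parsed (field number, value) pairs. *)
type wire_value =
  | Varint of int64
  | Bytes of string

type payload =
  | Text of string
  | Code of int32

type event = { payload : payload option }

(* Fold over the fields in wire order. Each oneof member overwrites the
   previous choice, so the last member on the wire wins. Unknown field
   numbers and mismatched wire types are skipped. *)
let decode_event (fields : (int * wire_value) list) : event =
  let step acc (number, value) =
    match number, value with
    | 1, Bytes s -> Some (Text s)
    | 2, Varint n -> Some (Code (Int64.to_int32 n))
    | _ -> acc
  in
  { payload = List.fold_left step None fields }
```

In this sketch an empty field list decodes to `{ payload = None }`, while `[ (1, Bytes "") ]` decodes to `{ payload = Some (Text "") }`, so "not sent" and "sent with a default value" stay distinguishable.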
You mean making the option-wrapping behavior for oneof fields opt-in, controlled by a command line flag? That could be a start, but it seems that the current behavior is not aligned with other implementations. We could probably run a test: encode a message whose oneof has no member set with a somewhat official implementation like Golang, then decode it both in Golang and with code generated by ocaml-protoc. If the semantics differ, as in Golang decoding it as "oneof not set" and ocaml-protoc decoding it as "first constructor with all-defaults", we can probably conclude that ocaml-protoc is incompatible with the official tooling. |
I think you're right overall, an option would be the more correct behavior. It does make the types more annoying to manipulate (since everything becomes an option, except for repeated fields), but if it's the price to pay for being compliant (and sending less data over the wire, too) then so be it.
Yes, the main thing with this is that this change would prompt another major release. Why not, I suppose, but it'd be nice to have at least one transition release where this is opt-in instead of opt-out (or no opt-out at all). |
Sounds good to me!
Regarding annoying types, my prototype for message validation is already
capable of removing options for fields that are marked as required. I'm
positive this should be the way to make this ergonomic for the end users.
…On Tue, Jan 16, 2024, 22:21 Simon Cruanes ***@***.***> wrote:
I think you're right overall, an option would be the more correct
behavior. It does make the types more annoying to manipulate (since
everything becomes an option, except for repeated fields), but if it's the
price to pay for being compliant (and sending less data over the wire, too)
then so be it.
You mean making option-wrapping behavior for oneof fields an opt-in,
controlled by command line flag? That could be a start, but it seems that
current behavior is not aligned with other implementations.
Yes, the main thing with this is that this change would prompt another
major release. Why not, I suppose, but it'd be nice to have at least one
transition release where this is opt-in instead of opt-out (or no opt-out
at all).
|
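This is so absurd though, they removed `required` and now it's back in the form of an option 😅. Oh well, why not, if we assume that validation constraints are *always* checked before returning a decoded message.
I've been thinking that maybe the mutable messages should be public (and always with `option` in fields, to reflect what goes on the wire). The non-mutable ones can skip options if validation permits it. My other use case for exposing mutable messages is to reduce allocations when writing messages (if the mutable message is reused).
|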
That's an interesting idea. I'm so far generating a new set of types with
'_validated' suffixes, and functions which validate original types and
produce new ones (mostly no-op copy of values). This way validation can be
used as opt-in on top of core types generated by ocaml-protoc.
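A minimal sketch of what such a validation layer could look like, assuming hand-written types rather than the actual prototype: the `event` message, its fields and `validate_event` are hypothetical names, and both fields are assumed to be marked as required.

```ocaml
(* Hypothetical core types (assumption): every singular field is an option,
   mirroring what may or may not be present on the wire. *)
type payload =
  | Text of string
  | Code of int32

type event = {
  id : string option;
  payload : payload option;
}

(* Hypothetical validated counterpart: fields marked as required lose their
   option wrapper. *)
type event_validated = {
  id : string;
  payload : payload;
}

(* Copy the values over, failing on the first missing required field. *)
let validate_event (e : event) : (event_validated, string) result =
  match e.id, e.payload with
  | None, _ -> Error "id is required"
  | _, None -> Error "payload is required"
  | Some id, Some payload -> Ok { id; payload }
```

Application code that opts in would pattern-match on the `result` once and then work with `event_validated` without unwrapping options at every use site.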
…On Tue, Jan 16, 2024, 22:37 Simon Cruanes ***@***.***> wrote:
This is so absurd though, they removed required and now it's back in the
form of an option 😅 . Oh well, why not, if we assume that validation
constraints are *always* checked before returning a decoded message.
I've been thinking that maybe the mutable messages should be public (and
always with option in fields, to reflect what goes on the wire). The
non-mutable ones can skip options if validation permits it. My other use
case for exposing mutable messages is to reduce allocations when writing
messages (if the mutable message is reused).
|
In a way it's the same idea, but the current type of non-validated messages becomes the validated one (with additional constraints, including non-nullability), and the current mutable messages become public and are not validated. |
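A rough sketch of that split, under the assumption of a hypothetical `event` message; `event_mutable`, `reset` and `freeze` are illustrative names, not existing ocaml-protoc output.

```ocaml
(* Hypothetical shapes (assumption), not the actual ocaml-protoc output. *)
type payload =
  | Text of string
  | Code of int32

(* Public mutable message: every field is an option, reflecting the wire. *)
type event_mutable = {
  mutable id : string option;
  mutable payload : payload option;
}

(* Immutable, validated message: required fields carry no option. *)
type event = {
  id : string;
  payload : payload;
}

let make_event_mutable () : event_mutable = { id = None; payload = None }

(* Clear the record so the same allocation can be reused for the next message. *)
let reset (m : event_mutable) : unit =
  m.id <- None;
  m.payload <- None

(* Validation is the step that produces the immutable form; it returns None
   when a required field is missing. *)
let freeze (m : event_mutable) : event option =
  match m.id, m.payload with
  | Some id, Some payload -> Some { id; payload }
  | _ -> None
```

A writer could keep one `event_mutable` around, fill it in, encode it, and call `reset` before the next message, while readers that want non-nullable fields go through `freeze`.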
According to [1]:
Link from quote above leads to another page [2], that tells us that:
It seems that runtimes for other languages allow checking whether a oneof was actually set (i.e. at least one field from the oneof was on the wire).
ocaml-protoc seems to generate the variant for a oneof and then embed it directly into the message type without any option. When parsing, it defaults to the first constructor with all-default content inside... Wouldn't it make sense to wrap that in an option and, if none of the oneof fields actually arrive on the wire, pass None to the application so it can handle this case conveniently, instead of matching the first constructor with default values and assuming it was "not sent"?
[1] https://protobuf.dev/programming-guides/encoding/#oneofs
[2] https://protobuf.dev/programming-guides/proto2#oneof