Adding Struct Metadata Support to Serializer/Deserializer Traits #2835

Open
michaelvanstraten opened this issue Oct 17, 2024 · 0 comments

Problem Statement

Certain serialization formats require additional metadata about struct fields beyond their type and name to correctly serialize and deserialize data.

A prominent example is Google's Protocol Buffers (protobuf), where each field in a message is identified by a unique, pre-assigned number called a field number (prost calls it a tag). Furthermore, protobuf supports several wire encodings, such as variable-length (varint), fixed-size, and length-delimited, and the same Rust integer type can map to more than one of them.

Currently, the Serializer and Deserializer traits in Serde do not provide a mechanism to pass such additional metadata to the serialization/deserialization implementation. This limitation makes it challenging to implement serializers for formats like protobuf using Serde.

To illustrate this issue, consider the following modified example inspired by the prost crate documentation:

use prost::Message;

#[derive(Clone, PartialEq, Message)]
struct Person {
    #[prost(string, tag = "1")]
    pub id: String,
    // NOTE: Skipping to less commonly occurring fields
    #[prost(string, tag = "16")]
    pub name: String,
}

In this example, each field carries a tag attribute that specifies its unique identifier within the protobuf message. When attempting to serialize or deserialize this struct using a Serde-based protobuf serializer/deserializer, there is no straightforward way to pass the tag information to the Serializer or Deserializer, because Serde's traits only expose the struct name and the field names.
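To make the dependency on the tag concrete: on the wire, every protobuf field is prefixed by a key, which is the varint encoding of (field_number << 3) | wire_type. The following is a minimal sketch in plain Rust; the helper names (encode_varint, field_key, encode_string_field) are hypothetical and are not part of prost or Serde.

```rust
/// Protobuf wire type for length-delimited payloads (strings, bytes, nested messages).
const WIRE_TYPE_LEN: u64 = 2;

/// Encode `value` as a base-128 varint, least significant group first.
fn encode_varint(mut value: u64) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return out;
        }
        out.push(byte | 0x80); // continuation bit: more groups follow
    }
}

/// Build the key that precedes a field's payload on the wire.
fn field_key(tag: u32, wire_type: u64) -> Vec<u8> {
    encode_varint((u64::from(tag) << 3) | wire_type)
}

/// Encode one string field: key, payload length, then the UTF-8 bytes.
fn encode_string_field(tag: u32, value: &str) -> Vec<u8> {
    let mut out = field_key(tag, WIRE_TYPE_LEN);
    out.extend(encode_varint(value.len() as u64));
    out.extend_from_slice(value.as_bytes());
    out
}
```

With this encoding, tag 1 produces a one-byte key and tag 16 a two-byte key. Neither value can be derived from the field names id and name, which is all a Serde serializer receives for each field.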

There are a few potential workarounds for this limitation, such as:

  1. Phantom Fields: Introducing phantom fields that signal to the serializer, by their field names, that it should update its internal state.
  2. Thread-local Storage: Using thread-local storage to hold the per-field metadata, which the serializer/deserializer can then access. Although this approach is quite hacky, it does not alter how other serializers/deserializers see the types.
  3. Const Generics and Specialization: Wrapping field types in #[repr(transparent)] structs with const generics, and implementing a trait that uses the min_specialization feature. This approach might look something like:
    impl<T> Trait for T {
        default fn foo() -> Option<Metadata> { None }
    }
    The trait would then only be specialized for the wrapper type.

However, all of these workarounds have downsides beyond being somewhat hacky. The phantom-field and specialization approaches, in particular, change how other serializers/deserializers see the types, which introduces additional complexity and potential issues.
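For illustration, the thread-local workaround (option 2) can be sketched without Serde at all. The names below (TagTable, CURRENT_TAGS, with_tags, lookup_tag, PERSON_TAGS) are hypothetical. The sketch assumes that the type's Serialize impl wraps its body in with_tags and that a protobuf-aware serializer calls lookup_tag for each field it is handed.

```rust
use std::cell::Cell;

/// Hypothetical per-struct table mapping field names to protobuf tags.
type TagTable = &'static [(&'static str, u32)];

thread_local! {
    /// Table for the struct currently being serialized on this thread.
    static CURRENT_TAGS: Cell<Option<TagTable>> = Cell::new(None);
}

/// Publish `table` while `f` runs, then restore the previous value
/// so nested structs do not clobber their parent's table.
/// (A panic inside `f` would skip the restore; a real version needs a guard.)
fn with_tags<R>(table: TagTable, f: impl FnOnce() -> R) -> R {
    let previous = CURRENT_TAGS.with(|slot| slot.replace(Some(table)));
    let result = f();
    CURRENT_TAGS.with(|slot| slot.set(previous));
    result
}

/// What a format-specific serializer would call from its per-field hook.
fn lookup_tag(field: &str) -> Option<u32> {
    CURRENT_TAGS
        .with(|slot| slot.get())
        .and_then(|table| table.iter().find(|(name, _)| *name == field))
        .map(|(_, tag)| *tag)
}

/// Table matching the `Person` example above.
const PERSON_TAGS: TagTable = &[("id", 1), ("name", 16)];
```

Serializers that never call lookup_tag simply ignore the table, which is why other formats are unaffected. The cost is that correctness now depends on both sides agreeing on hidden global state.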

Proposed Solution

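We could introduce a new method with a default implementation in each of the Serializer and Deserializer traits to allow passing additional metadata about struct fields. Because the default forwards to the existing struct methods, existing implementations keep compiling and behaving as before.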

Here’s an example of how this could look:

pub trait Serializer: Sized {
    ...
    fn serialize_struct_with_metadata(
        self,
        metadata: &'static StructMetadata,
    ) -> Result<Self::SerializeStruct, Self::Error> {
        self.serialize_struct(metadata.name(), metadata.len())
    }
    ...
}

pub trait Deserializer<'de>: Sized {
    ...
    fn deserialize_struct_with_metadata<V>(
        self,
        metadata: &'static StructMetadata,
        visitor: V,
    ) -> Result<V::Value, Self::Error>
    where
        V: Visitor<'de>,
    {
        self.deserialize_struct(metadata.name(), metadata.fields(), visitor)
    }
    ...
}

The StructMetadata struct would hold the original metadata (the struct name and its field names), along with a function that takes a field name and returns a &'static FieldMetadata. FieldMetadata would in turn expose a method that takes a static attribute key and is generic over a type T implementing Deserialize, returning an Option<T> that holds the attribute's value when it is present.
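One possible shape for these two types is sketched below. Everything beyond name(), len(), and fields() is an assumption: attributes are stored as raw strings, the lookup returns an Option so unknown field names can be handled, and FromStr stands in for Serde's Deserialize bound so the sketch stays std-only. The statics at the end are what a derive macro might emit for Person, written by hand here.

```rust
use std::str::FromStr;

/// Attributes attached to one field, stored as raw key/value strings.
pub struct FieldMetadata {
    attributes: &'static [(&'static str, &'static str)],
}

impl FieldMetadata {
    /// Look up `key` and parse it as `T`. The proposal bounds `T` by
    /// Serde's `Deserialize`; `FromStr` is a stand-in here.
    pub fn attribute<T: FromStr>(&self, key: &'static str) -> Option<T> {
        self.attributes
            .iter()
            .find(|(name, _)| *name == key)
            .and_then(|(_, raw)| raw.parse().ok())
    }
}

/// Struct-level metadata: what `serialize_struct` already receives,
/// plus a lookup function for per-field metadata.
pub struct StructMetadata {
    name: &'static str,
    fields: &'static [&'static str],
    field_metadata: fn(&str) -> Option<&'static FieldMetadata>,
}

impl StructMetadata {
    pub fn name(&self) -> &'static str { self.name }
    pub fn fields(&self) -> &'static [&'static str] { self.fields }
    pub fn len(&self) -> usize { self.fields.len() }
    pub fn field(&self, name: &str) -> Option<&'static FieldMetadata> {
        (self.field_metadata)(name)
    }
}

// Hand-written stand-in for derive output on `Person`.
static ID_META: FieldMetadata = FieldMetadata { attributes: &[("tag", "1")] };
static NAME_META: FieldMetadata = FieldMetadata { attributes: &[("tag", "16")] };

fn person_field(name: &str) -> Option<&'static FieldMetadata> {
    match name {
        "id" => Some(&ID_META),
        "name" => Some(&NAME_META),
        _ => None,
    }
}

pub static PERSON_METADATA: StructMetadata = StructMetadata {
    name: "Person",
    fields: &["id", "name"],
    field_metadata: person_field,
};
```

A protobuf serializer overriding serialize_struct_with_metadata could then call something like metadata.field("name") and read the "tag" attribute as a u32, while serializers that keep the default implementation never look at it.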

This approach would allow serializers and deserializers to access field-level metadata without altering how existing Serde implementations operate.


Thank you for taking the time to read this proposal and for considering its potential impact on Serde's ecosystem.
