Allow integer tags for internally tagged enums #745

Open
dtolnay opened this issue Feb 3, 2017 · 16 comments · May be fixed by #2525

Comments

@dtolnay
Member

dtolnay commented Feb 3, 2017

See this use case.

Would the internally tagged enum support allow me to handle schema versioning defined like this?

{
    "schema_version": 1,
    ...
}

I've never gotten a good answer about how I'd do that with Serde.

cc @ssokolow

@dtolnay
Member Author

dtolnay commented Feb 13, 2017

Attributes support non-string literals now, right? This could be as simple as allowing:

#[derive(Serialize, Deserialize)]
#[serde(tag = "schema_version")]
enum E {
    #[serde(rename = 1)]
    V1(...),
    #[serde(rename = 2)]
    V2(...),
}

@dtolnay
Member Author

dtolnay commented Apr 15, 2017

Also boolean tags?

#[derive(Serialize, Deserialize)]
#[serde(tag = "error")]
enum Response {
    #[serde(rename = false)]
    Ok(QueryResult),
    #[serde(rename = true)]
    Err(QueryError),
}

@Phaiax

Phaiax commented Jun 30, 2017

Any plans to progress on this feature? I could give it a try, but I would need a bit of mentoring / pointing to the right places.

@dtolnay
Member Author

dtolnay commented Jun 30, 2017

I have not started working on this. I would love a PR! Happy to provide guidance if you run into any trouble.

@Noxime

Noxime commented May 8, 2018

Any updates on this?

@NotBad4U

Hi 😄
We need this feature for sozu #240 to handle configuration versioning.

I'd like to implement it. I saw that someone started working on it in #973 but abandoned it.
Can I use that as a starting point, or do you recommend another approach?

@Arnavion

@NotBad4U You can use a string tag (the version enum's variant name) for the configuration version.

@dtolnay
Member Author

dtolnay commented Sep 14, 2018

@NotBad4U I think #973 is the right approach. Literals in attributes will be stable in Rust 1.30, so we can support #[serde(rename = 0)].

@WiSaGaN
Contributor

WiSaGaN commented Dec 12, 2019

I guess this is blocked on #1392?

@Ekleog

Ekleog commented Apr 10, 2020

For what it's worth, I think there's an additional use case for this (though it's technically not for internally tagged enums, it'd hopefully be fixed the same way): #[repr(i32)] enums and the like.

Right now my solution is https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=9154cf599592144c4473903b57d91abe ; but that's an awful lot of boilerplate for this simple use case :)

@ErichDonGubler

@Ekleog: I was able to use the serde_repr crate suggested by official docs to shorten your playground to this:

//! ```cargo
//! [dependencies]
//! serde = "1"
//! serde_repr = "0.1"
//! serde_json = "1"
//! ```

use std::fmt;

#[derive(Copy, Clone, Debug, serde_repr::Serialize_repr, serde_repr::Deserialize_repr)]
#[repr(i32)]
pub enum Test {
    Foo = 0,
    Bar = 2,
}

fn main() {
    println!("{}", serde_json::to_string(&Test::Foo).unwrap());
    println!("{:?}", serde_json::from_str::<Test>("0").unwrap());
}

@Ekleog

Ekleog commented Jul 13, 2020

This looks cool! I hadn't seen that in the docs when writing that message. Thank you!

@vallentin

This issue came up in a question on Stack Overflow.

For anybody who needs integer tags today, I answered on Stack Overflow with a workaround: a custom serializer and deserializer that first deserializes into a serde_json::Value.

jjbayer added a commit to getsentry/relay that referenced this issue Jan 27, 2023
After deploying #1678, we saw a
rise in memory consumption. We narrowed down the reason to
deserialization of replay recordings, so this PR attempts to replace
those deserializers with more efficient versions that do not parse an
entire `serde_json::Value` to get the tag (`type`, `source`) of the
enum.

A custom deserializer is necessary because serde does not support
[integer tags for internally tagged
enums](serde-rs/serde#745).

- [x] Custom deserializer for `NodeVariant`, based on serde's own
`derive(Deserialize)` of internally tagged enums.
- [x] Custom deserializer for `recording::Event`, based on serde's own
`derive(Deserialize)` of internally tagged enums.
- [x] Custom deserializer for `IncrementalSourceDataVariant`, based on
serde's own `derive(Deserialize)` of internally tagged enums.
- [x] Box all enum variants.

### Benchmark comparison

Ran a criterion benchmark on `rrweb.json`. It does not tell us anything
about memory consumption, but the reduced cpu usage points to simpler
deserialization:

#### Before

```
rrweb/1                 time:   [142.37 ms 148.17 ms 155.61 ms]
```

#### After

```
rrweb/1                 time:   [31.474 ms 31.801 ms 32.137 ms]
```

#skip-changelog

---------

Co-authored-by: Colton Allen <[email protected]>
Co-authored-by: Oleksandr <[email protected]>
@ysndr

ysndr commented Mar 1, 2023

I'm still pretty much in need of this...

For now I came up with this approach using const generics.
Putting this here in the hope that it is helpful to others, or that someone tells me what is wrong with it:

#[derive(Serialize, Deserialize, Debug)]
#[serde(untagged)]
pub enum Bla {
    V1 {
        hello: String,
        version: Option<Version<1>>,
    },
    V2 {
        foo: String,
        version: Version<2>,
    },
}

#[derive(Debug)]
pub struct Version<const V: u8>;

#[derive(Debug, Error)]
#[error("Invalid version")]
struct VersionError;

impl<const V: u8> Serialize for Version<V> {
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: serde::Serializer,
    {
        serializer.serialize_u8(V)
    }
}

impl<'de, const V: u8> Deserialize<'de> for Version<V> {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: serde::Deserializer<'de>,
    {
        let value = u8::deserialize(deserializer)?;
        if value == V {
            Ok(Version::<V>)
        } else {
            Err(serde::de::Error::custom(VersionError))
        }
    }
}

@danielo515

It would be cool to be able to specify a custom deserializer for the key. In my case the tag is an array containing a single string (it's weird, I know), and I would love to use it to parse my enum directly, without going through several steps.

@Sytten

Sytten commented Feb 2, 2024

@ysndr The only downside is that it doesn't fail with a nice error message, since the untagged enum will try the other versions if the JSON is invalid but the version is correct. Since we don't have ContentRefDeserializer in the public API, it is a bit hard to create a custom deserializer for the Bla enum. I guess I am waiting on #2525 to be merged :)

In the meantime here is my solution (with schemars support):

#[derive(Clone, Debug)]
pub struct Edition<const V: u8>;

impl<const V: u8> Edition<V> {
    pub const ERROR: &'static str = "Invalid edition";
}

impl<const V: u8> PartialEq<Edition<V>> for u8 {
    fn eq(&self, _: &Edition<V>) -> bool {
        V == *self
    }
}

impl<const V: u8> Serialize for Edition<V> {
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: serde::Serializer,
    {
        serializer.serialize_u8(V)
    }
}

impl<'de, const V: u8> Deserialize<'de> for Edition<V> {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: serde::Deserializer<'de>,
    {
        let value = u8::deserialize(deserializer)?;
        if value == V {
            Ok(Edition::<V>)
        } else {
            Err(serde::de::Error::custom(Self::ERROR))
        }
    }
}

impl<const V: u8> JsonSchema for Edition<V> {
    fn schema_name() -> String {
        "Edition".to_owned()
    }

    fn schema_id() -> Cow<'static, str> {
        Cow::Owned(format!("Edition_{}", V))
    }

    fn json_schema(gen: &mut schemars::gen::SchemaGenerator) -> Schema {
        use schemars::schema::*;

        let mut schema = gen.subschema_for::<u8>();
        if let Schema::Object(schema_object) = &mut schema {
            if schema_object.has_type(InstanceType::Integer)
                || schema_object.has_type(InstanceType::Number)
            {
                let validation = schema_object.number();
                validation.minimum = Some(V as f64);
                validation.maximum = Some(V as f64);
            }
        }
        schema
    }
}

Then you implement a custom deserializer for your untagged enum:

#[derive(Serialize)]
#[serde(untagged)]
pub enum MyObject {
    V2(v2::MyObject),
    V1(v1::MyObject),
}

impl<'de> Deserialize<'de> for MyObject {
    fn deserialize<D>(deserializer: D) -> std::result::Result<Self, D::Error>
    where
        D: serde::Deserializer<'de>,
    {
        use serde::__private::de::{Content, ContentRefDeserializer};

        let content = Content::deserialize(deserializer)?;

        match v2::MyObject::deserialize(ContentRefDeserializer::<D::Error>::new(&content))
        {
            Ok(v) => return Ok(MyObject::V2(v)),
            Err(e) if e.to_string() != Edition::<2>::ERROR => return Err(e),
            Err(_) => {}
        }

        match v1::MyObject::deserialize(ContentRefDeserializer::<D::Error>::new(&content))
        {
            Ok(v) => return Ok(MyObject::V1(v)),
            Err(e) if e.to_string() != Edition::<1>::ERROR => return Err(e),
            Err(_) => {}
        }

        Err(serde::de::Error::custom(
            "data did not match any variant of untagged enum MyObject",
        ))
    }
}
