Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attempt to make str -> ron::Value lossless #553

Draft
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

wiiznokes
Copy link

@wiiznokes wiiznokes commented Nov 23, 2024

This PR have the same goal of #328 (i just saw it while opening this one)

This PR currently do not even compile. I'm only opening it for visibility/feedback. I think my map module have some potential though.

I came across a potential blocker (serde-rs/serde#2218). The name of a struct need to have a static lifetime somehow. I think i will try to workaround it by leaking memory for now.

I still need to figuring out the de serialization part

@juntyr
Copy link
Member

juntyr commented Nov 23, 2024

What if type (and field) names are stored using a Cow<‘static, str>? On deserialize, we just fill in a String. On serialise, we fail if the Cow is borrowed. Value then gets a convenience method to leak all owned strings (in the future we could also add a method that uses an interner so that no duplicate leaks occur).

@wiiznokes
Copy link
Author

What if type (and field) names are stored using a Cow<‘static, str>? On deserialize, we just fill in a String. On serialise, we fail if the Cow is borrowed. Value then gets a convenience method to leak all owned strings (in the future we could also add a method that uses an interner so that no duplicate leaks occur).

That seems like a nice work around

@juntyr
Copy link
Member

juntyr commented Nov 25, 2024

we also need a named unit variant with a string field (e.g. for unit structs and unit variants)

I was thinking of using Struct(None, empty::map) for "()", and Struct(Some("name"), empty::map) for unit_struct with indent. For enum_variant_unit, we can use Tuple(Some("name"), empty::vec).

No quite - unit structs and unit variants are serialized in RON as "Name", but empty structs / variants are serialized as "Name()". So we would need a separate NamedUnit(Cow<'static, str>) variant to represent them

@juntyr
Copy link
Member

juntyr commented Nov 25, 2024

Another issue is that RON treats structs and enum variants mostly the same, except that structs can be serialized and deserialized without the name. So if we serialize an enum variant into Value and it becomes a struct there and we then serialize it to a string without struct names, we can no longer go back. We also can't just check the serializer type name (so that we would serialize all struct-like types as enums when we're sure we're serializing to RON) since it may be wrapped.

The other option would be to have explicitly distinct variants for enums. Serializing to Value would not be an issue and serializing the Value to a string would also work. When deserializing from a string we'd have to accept that there would now be implementation-defined ambiguity as to whether Name() is parsed as a named struct or a named variant. As long as the original type could deserialize from both variants in Value this might be a workable compromise.

@max-heller
Copy link

If you're interested in other workarounds for the Serializer &'static str issue, I implemented a couple here. The only ones I've found are (1) allocate and leak each unique string (2) reduce structs/enums to maps, which is what serde_json does.

@wiiznokes
Copy link
Author

wiiznokes commented Jan 6, 2025

This is my last version of Value:

pub enum Value {
    Unit,
    Bool(bool),
    Char(char),
    Number(Number),
    String(String),
    Bytes(Vec<u8>),
    Option(Option<Box<Value>>),
    List(Vec<Value>),
    Map(Map<Value>),
    Tuple(Vec<Value>),

    UnitStructOrEnumVariant(Cow<'static, str>),
    UnitEnumVariant(Cow<'static, str>),
    UnitStruct(Cow<'static, str>),

    StructOrEnumVariant(Option<Cow<'static, str>>, Map<Cow<'static, str>>),
    Struct(Option<Cow<'static, str>>, Map<Cow<'static, str>>),
    EnumVariant(Cow<'static, str>, Map<Cow<'static, str>>),

    EnumTuple(Cow<'static, str>, Vec<Value>),
}

It define variant for case when we don't know the exact type of the value. This way, we can provide option for serialization, like don't write the name of struct, while being conservative when we don't know if we can remove it.

@juntyr

@juntyr
Copy link
Member

juntyr commented Jan 6, 2025

Let's go with this Value for now, I think we'll learn more during the implementation of serialising to it and deserialising from it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants