Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider reimplement it with a tagged-union version #3

Open
js2xxx opened this issue Apr 1, 2024 · 3 comments
Open

Consider reimplement it with a tagged-union version #3

js2xxx opened this issue Apr 1, 2024 · 3 comments

Comments

@js2xxx
Copy link

js2xxx commented Apr 1, 2024

The version of dedicated tagged unions can benefit more than the current version, which uses Box<dyn Any>. Here's the comparison:

Pros:

  • Doesn't need dynamic allocation & dispatching;
  • Requires looser lifetime bounds to the generics;
  • Allows deriving common traits (Clone, (Partial)Eq, etc.) easily;
  • Allows multiple instances with the same type and different tags (e.g. Sum<(u32, String, u32)>).

Con(s):

  • Requires unsafe code around unions & may cause soundness problems.

I wrote a tagged-union version of this sum type. Check it out here if you're interested. I would also be happy to create a PR and reimplement this crate if needed, although it results in massive code diffs.

@spacejam
Copy link
Member

spacejam commented Apr 1, 2024

Hey :) I started implementing a tagged-union yesterday, but then realized it might be possible to actually do this with actual enums and maaaaybe avoid some unsafe, but I'm less sure about that, where each TypeSet has acorresponding TypeSet::Enum that may be something like E2<A, B>{ A(A), B(B) } with the same arity as the TypeSet. I'm still working through some of the narrow/broaden transformation steps, and I'm not completely sure it's possible yet, and I think it would still require a 'static bound. But in both the enum and tagged union approaches, narrowing and broadening are no longer strictly type-level operations because the size on the stack may need to change, undermining some of the benefits of avoiding boxing. To be honest I'm skeptical that there are very many workloads that really justify the additional complexity on the cold path. But it's a fun problem to reason about!

@spacejam
Copy link
Member

spacejam commented Apr 1, 2024

One of the reasons I'm really curious about the enum-based approach is because if it actually works, users could pull a subset of locally-addressable concerns out of a OneOf and then use normal rust pattern matching on the result. I still don't know if it will be possible but the idea of an API like this feels pretty compelling to me:

let one_of_3: OneOf<(Local1, Local2, Global)> = OneOf(Local2::default());

// assumes fn returns Result<_, OneOf<(Global,)>>
let local_concerns: OneOf<(Local1, Local2)> = one_of_3.narrow()?; 

match local_concerns.enum() {
    E2::A(Local1) => {},
    E2::B(Local2) => {},
}

@js2xxx
Copy link
Author

js2xxx commented Apr 1, 2024

I suspect the enum repr version would cause the narrowing and broadening operations to cost a lot.

Consider a specialized case of narrowing:

fn narrow<A, B, C, D, E>(repr: E5<A, B, C>) -> Result<C, E4<A, B, D, E>> {
    match repr {
        // 1. Matched.
        E3::C(c) => Ok(c),

        // 2. Unmatched, tag unchanged.
        E3::A(a) => Err(E2::A(a)),
        E3::B(b) => Err(E2::B(b)),

        // 3. Unmatched, tag changed.
        E3::D(d) => Err(E2::C(d)),
        E3::E(e) => Err(E2::D(e)),
    }
}

In this case, the pattern matching is forced to be expanded to 5 arms, while the conditions can be classified into 3 groups (commented above), which in the tagged union case can be reduced to 3 comparisons.

I don't know whether the compiler is smart enough to eliminate those redundant branches, but I think we cannot rely on that behavior since it lies in the scope of optimization.

By the way, the conditions can be further reduced to 2 groups with the Box<dyn Any> version, since the tag equals its type ID in that case, unnecessary to be changed, which seems to have the best performance.

EDIT: Here is my implementation of the enum version. It has fewer code lines but involves more macros and complex generic manipulations compared to the tagged union version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants