Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Greatly reduce amount of generated code #1

Merged
merged 8 commits into from
Sep 6, 2024
Merged

Conversation

arthurprs
Copy link
Collaborator

@arthurprs arthurprs commented Aug 22, 2024

Why private-fork an unmaintained crate

  • Even if serde_cbor is unmaintained, it's stable. We have almost all ditto serialization to attest to that. Even at the time of archival, there weren't any real bugs open other than Option<()> handling.
  • Switching to https://github.com/enarx/ciborium or https://github.com/quininer/cbor4ii is unlikely to yield any net benefits other than labeling our dependency as maintained. Furthermore, there's no guarantee it's compatible as there's more than one way to model serde on top of cbor.
  • This fork focuses on reducing generated code size and is opinionated; porting this to someone else's crate is equivalent to a rewrite and is unlikely to be well received and/or merged quickly.

Changes

  • A fuzz test ensures* that the fork and the original serde_cbor are compatible at the lower level. At the higher level (structs/enums/..), compatibility is ensured by passing all the same tests and also not touching the code.
  • Add traits and two different implementations for configuring the encoder/encoder. The standard (static) configuration produces very little code as it avoids generating multiple branches in the encoder/decoder for each runtime configuration.
  • Generate only the strictly required decoder code for each deserialized data type. So if a u64 is to be deserialized, only generate code to deserialize integers and fail for other core entities. This is done via the type system with traits and associated constants. Note that the decoder is the largest source of code bloat.
  • Tidy up the code generation by reusing code from ucbor, which is highly optimized for speed and size.
  • All non-configurable behavior was retained to allow fuzzing the new implementation against the original serde_cbor. This includes the nuances of modeling structs/newtypes/enums atop of the serde data model.
  • Bugs were also retained. Specifically, disallowing packed encoding will also disallow any maps with integer keys, which isn't strictly correct. Ditto doesn't use packed encoding, so this doesn't affect our usage. I left it as it reduces the number of changes needed to make tests and fuzzer compatible.

Breaking changes

  • There's a second generic type argument on the (De)serializer types.
  • The standard (non-configured) decoder disallows legacy enums by default.
  • The error messages are slightly different

Theoretical breaking change

  • Breaks Deserialize implementations that tell the deserializer they want something but accept something completely different. All reasonable compatibility {str/bytes/seq}, {unsigned/signed/float} compatibilities were implemented to give an extra margin of safety.
  • Example: if the deserialize implementation calls deserialize_str(visitor) but would produce a value if the visitor visit_u64(..) was called. That is incompatible with this fork.

@arthurprs arthurprs changed the title Put codegen on a diet Greatly reduce amount of generated code Aug 22, 2024
tests/enum.rs Outdated Show resolved Hide resolved
src/ser.rs Outdated Show resolved Hide resolved

#[allow(missing_docs)]
#[inline]
fn to_custom(&self) -> CustomSerializerOptions {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this not be a impl<T: SerializerOptions> From<T> for CustomerSerializerOptions {} ?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(and same for the DeserializerOptions?)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a conflict with the default implementation impl<T> From<T> for T

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries then. Oh boy the day Rust gets partial specialization I will be very happy

src/ser.rs Outdated Show resolved Hide resolved
Copy link

@p-avital p-avital left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the new generics, though some might actually lead to more code getting generated in some points (if code factorization fails, which it shouldn't for such simple cases).

Have you measured the impact on binary size? (not that I feel the need for that to approve, just curious :) )

@arthurprs
Copy link
Collaborator Author

@p-avital

cargo_bloat measurements courtesy of @frankie

Before

 File  .text     Size Crate
 9.1%  15.4%   4.0MiB std
 6.4%  10.8%   2.8MiB serde_cbor
 3.4%   5.8%   1.5MiB ditto_replication
...
14.8%  25.1%   6.5MiB And 251 more crates. Use -n N to show more.
59.1% 100.0%  26.1MiB .text section size, the file size is 44.2MiB

After

 File  .text     Size Crate
 9.7%  16.8%   4.0MiB std
 3.7%   6.4%   1.5MiB ditto_replication
 3.5%   6.1%   1.5MiB [Unknown]
 3.1%   5.3%   1.3MiB tokio
 2.2%   3.7% 903.3KiB zvariant
 2.1%   3.6% 884.6KiB ditto_mesh
 1.9%   3.3% 809.2KiB tracing
 1.8%   3.1% 752.2KiB serde_cbor
...
15.8%  27.3%   6.5MiB And 251 more crates. Use -n N to show more.
57.9% 100.0%  23.7MiB .text section size, the file size is 41.0MiB

src/ser.rs Outdated Show resolved Hide resolved
src/ser.rs Outdated Show resolved Hide resolved
src/ser.rs Outdated Show resolved Hide resolved
src/ser.rs Outdated Show resolved Hide resolved
@arthurprs
Copy link
Collaborator Author

As promised, performance evaluation with rust_serialization_benchmark. Mostly positive but one significant (16%) regression deserializing the mesh payload. I'll see if it's something obvious, but otherwise, it's good to merge.

➜  rust_serialization_benchmark git:(f83a991) ✗ cargo bench --features serde,serde_cbor --no-default-features
warning: profiles for the non root package will be ignored, specify profiles at the workspace root:
package:   /home/arthurprs/dev/rust_serialization_benchmark/pages/Cargo.toml
workspace: /home/arthurprs/dev/rust_serialization_benchmark/Cargo.toml
   Compiling rust_serialization_benchmark v0.1.0 (/home/arthurprs/dev/rust_serialization_benchmark)
    Finished `bench` profile [optimized] target(s) in 24.82s
     Running unittests src/lib.rs (target/release/deps/rust_serialization_benchmark-9307c06afa3cf00f)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running benches/bench.rs (target/release/deps/bench-63f4b759b80789ad)
Gnuplot not found, using plotters backend
Benchmarking log/serde_cbor/serialize: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.6s, enable flat sampling, or reduce sample count to 60.
log/serde_cbor/serialize
                        time:   [1.1157 ms 1.1197 ms 1.1238 ms]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
log/serde_cbor/deserialize
                        time:   [3.0322 ms 3.0401 ms 3.0481 ms]
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) low mild
  1 (1.00%) high severe
log/serde_cbor/size 1407835
log/serde_cbor/zlib 403440
log/serde_cbor/zstd 324081
log/serde_cbor/zstd_time time:   [3.6423 ms 3.6423 ms 3.6423 ms] 88 MB/s

mesh/serde_cbor/serialize
                        time:   [20.796 ms 20.863 ms 20.933 ms]
Found 8 outliers among 100 measurements (8.00%)
  1 (1.00%) low mild
  7 (7.00%) high mild
mesh/serde_cbor/deserialize
                        time:   [25.988 ms 26.045 ms 26.102 ms]
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild
mesh/serde_cbor/size 13122324
mesh/serde_cbor/zlib 7524660
mesh/serde_cbor/zstd 6759658
mesh/serde_cbor/zstd_time time:   [54.5447 ms 54.5447 ms 54.5447 ms] 123 MB/s

Benchmarking minecraft_savedata/serde_cbor/serialize: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.8s, enable flat sampling, or reduce sample count to 60.
minecraft_savedata/serde_cbor/serialize
                        time:   [1.1425 ms 1.1446 ms 1.1467 ms]
minecraft_savedata/serde_cbor/deserialize
                        time:   [3.1182 ms 3.1299 ms 3.1452 ms]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe
minecraft_savedata/serde_cbor/size 1109821
minecraft_savedata/serde_cbor/zlib 344751
minecraft_savedata/serde_cbor/zstd 274526
minecraft_savedata/serde_cbor/zstd_time time:   [4.4027 ms 4.4027 ms 4.4027 ms] 62 MB/s

mk48/serde_cbor/serialize
                        time:   [5.6154 ms 5.6582 ms 5.7030 ms]
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild
mk48/serde_cbor/deserialize
                        time:   [12.754 ms 12.790 ms 12.828 ms]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
mk48/serde_cbor/size 5878653
mk48/serde_cbor/zlib 1655791
mk48/serde_cbor/zstd 1431560
mk48/serde_cbor/zstd_time time:   [19.9277 ms 19.9277 ms 19.9277 ms] 71 MB/s

➜  rust_serialization_benchmark git:(f83a991) ✗ cargo bench --features serde,serde_cbor --no-default-features
warning: profiles for the non root package will be ignored, specify profiles at the workspace root:
package:   /home/arthurprs/dev/rust_serialization_benchmark/pages/Cargo.toml
workspace: /home/arthurprs/dev/rust_serialization_benchmark/Cargo.toml
   Compiling serde_cbor v0.11.2 (https://github.com/arthurprs/cbor.git?branch=codegen-diet#52aaa517)
   Compiling rust_serialization_benchmark v0.1.0 (/home/arthurprs/dev/rust_serialization_benchmark)
    Finished `bench` profile [optimized] target(s) in 19.51s
     Running unittests src/lib.rs (target/release/deps/rust_serialization_benchmark-c8f9ad7e50d975fe)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running benches/bench.rs (target/release/deps/bench-86561f09432356c9)
Gnuplot not found, using plotters backend
log/serde_cbor/serialize
                        time:   [918.55 µs 920.34 µs 922.16 µs]
                        change: [-17.746% -17.507% -17.277%] (p = 0.00 < 0.05)
                        Performance has improved.
log/serde_cbor/deserialize
                        time:   [3.2433 ms 3.2486 ms 3.2538 ms]
                        change: [+6.5394% +6.8559% +7.1881%] (p = 0.00 < 0.05)
                        Performance has regressed.
log/serde_cbor/size 1407835
log/serde_cbor/zlib 403440
log/serde_cbor/zstd 324081
log/serde_cbor/zstd_time time:   [3.9189 ms 3.9189 ms 3.9189 ms] 82 MB/s

mesh/serde_cbor/serialize
                        time:   [18.359 ms 18.403 ms 18.447 ms]
                        change: [-12.146% -11.794% -11.425%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
mesh/serde_cbor/deserialize
                        time:   [30.970 ms 31.049 ms 31.129 ms]
                        change: [+18.803% +19.215% +19.626%] (p = 0.00 < 0.05)
                        Performance has regressed.
mesh/serde_cbor/size 13122324
mesh/serde_cbor/zlib 7524660
mesh/serde_cbor/zstd 6759658
mesh/serde_cbor/zstd_time time:   [59.4914 ms 59.4914 ms 59.4914 ms] 113 MB/s

Benchmarking minecraft_savedata/serde_cbor/serialize: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.1s, enable flat sampling, or reduce sample count to 60.
minecraft_savedata/serde_cbor/serialize
                        time:   [1.0160 ms 1.0175 ms 1.0192 ms]
                        change: [-11.479% -11.279% -11.066%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild
minecraft_savedata/serde_cbor/deserialize
                        time:   [2.9928 ms 2.9961 ms 2.9995 ms]
                        change: [-4.7591% -4.2750% -3.9060%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
minecraft_savedata/serde_cbor/size 1109821
minecraft_savedata/serde_cbor/zlib 344751
minecraft_savedata/serde_cbor/zstd 274526
minecraft_savedata/serde_cbor/zstd_time time:   [2.6698 ms 2.6698 ms 2.6698 ms] 102 MB/s

mk48/serde_cbor/serialize
                        time:   [5.1101 ms 5.1200 ms 5.1302 ms]
                        change: [-10.257% -9.5107% -8.8062%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
mk48/serde_cbor/deserialize
                        time:   [11.782 ms 11.805 ms 11.827 ms]
                        change: [-8.0137% -7.7054% -7.3752%] (p = 0.00 < 0.05)
                        Performance has improved.
mk48/serde_cbor/size 5878653
mk48/serde_cbor/zlib 1655791
mk48/serde_cbor/zstd 1431560
mk48/serde_cbor/zstd_time time:   [21.2273 ms 21.2273 ms 21.2273 ms] 67 MB/s

@arthurprs
Copy link
Collaborator Author

Fixed the big regression in 1b274f7 and further improved the rest.

➜  rust_serialization_benchmark git:(f83a991) ✗ cargo bench --features serde,serde_cbor --no-default-features
warning: profiles for the non root package will be ignored, specify profiles at the workspace root:
package:   /home/arthurprs/dev/rust_serialization_benchmark/pages/Cargo.toml
workspace: /home/arthurprs/dev/rust_serialization_benchmark/Cargo.toml
   Compiling rust_serialization_benchmark v0.1.0 (/home/arthurprs/dev/rust_serialization_benchmark)
    Finished `bench` profile [optimized] target(s) in 17.86s
     Running unittests src/lib.rs (target/release/deps/rust_serialization_benchmark-c437099c7e7e0b4c)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running benches/bench.rs (target/release/deps/bench-c6c941fb3b93f41d)
Gnuplot not found, using plotters backend
log/serde_cbor/serialize
                        time:   [903.14 µs 906.31 µs 909.49 µs]
                        change: [-18.865% -18.607% -18.357%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  1 (1.00%) low mild
  6 (6.00%) high mild
  2 (2.00%) high severe
log/serde_cbor/deserialize
                        time:   [3.1774 ms 3.1868 ms 3.1965 ms]
                        change: [+4.5207% +4.9002% +5.2736%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
  1 (1.00%) high severe
log/serde_cbor/size 1407835
log/serde_cbor/zlib 403440
log/serde_cbor/zstd 324081
log/serde_cbor/zstd_time time:   [3.2489 ms 3.2489 ms 3.2489 ms] 99 MB/s

mesh/serde_cbor/serialize
                        time:   [19.244 ms 19.308 ms 19.371 ms]
                        change: [-7.0922% -6.7476% -6.4059%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) low mild
  2 (2.00%) high mild
mesh/serde_cbor/deserialize
                        time:   [26.558 ms 26.605 ms 26.652 ms]
                        change: [+2.1291% +2.6363% +3.0838%] (p = 0.00 < 0.05)
                        Performance has regressed.
mesh/serde_cbor/size 13122324
mesh/serde_cbor/zlib 7524660
mesh/serde_cbor/zstd 6759658
mesh/serde_cbor/zstd_time time:   [56.2815 ms 56.2815 ms 56.2815 ms] 120 MB/s

minecraft_savedata/serde_cbor/serialize
                        time:   [988.00 µs 991.35 µs 994.46 µs]
                        change: [-13.675% -13.346% -12.959%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
minecraft_savedata/serde_cbor/deserialize
                        time:   [2.8819 ms 2.8855 ms 2.8889 ms]
                        change: [-6.9540% -6.2011% -5.7209%] (p = 0.00 < 0.05)
                        Performance has improved.
minecraft_savedata/serde_cbor/size 1109821
minecraft_savedata/serde_cbor/zlib 344751
minecraft_savedata/serde_cbor/zstd 274526
minecraft_savedata/serde_cbor/zstd_time time:   [2.4298 ms 2.4298 ms 2.4298 ms] 112 MB/s

mk48/serde_cbor/serialize
                        time:   [5.0094 ms 5.0252 ms 5.0424 ms]
                        change: [-7.2534% -6.8646% -6.5095%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild
  1 (1.00%) high severe
mk48/serde_cbor/deserialize
                        time:   [11.401 ms 11.419 ms 11.439 ms]
                        change: [-10.101% -9.8655% -9.6223%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe
mk48/serde_cbor/size 5878653
mk48/serde_cbor/zlib 1655791
mk48/serde_cbor/zstd 1431560
mk48/serde_cbor/zstd_time time:   [15.1785 ms 15.1785 ms 15.1785 ms] 94 MB/s

@arthurprs arthurprs merged commit e8ce605 into master Sep 6, 2024
3 checks passed
@arthurprs arthurprs deleted the codegen-diet branch September 6, 2024 07:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants