Optimize base64/hex decoding by pre-allocating output buffers (~2x faster) #12675

Open
simonvandel wants to merge 4 commits into main

Conversation

simonvandel (Contributor):

Which issue does this PR close?

Closes #.

Rationale for this change

It is generally faster to make one large allocation up front than to make many small allocations.

What changes are included in this PR?

  • Add a benchmark
  • Refactor to reduce code duplication
  • Change decoding methods to write into a pre-allocated buffer (see the sketch below)
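
As a rough sketch of the pattern (a minimal standalone version; the function name and the String error type are placeholders, not the exact code in the diff):

// Sketch of the pre-allocation pattern: one large zero-filled buffer,
// decode into it, then trim the unused tail. `decode` stands in for the
// hex/base64 slice decoders.
fn decode_preallocated<F>(
    input: &[u8],
    upper_bound: usize,
    decode: F,
) -> Result<Vec<u8>, String>
where
    F: Fn(&[u8], &mut [u8]) -> Result<usize, String>,
{
    // Zero-initialize so the buffer has a real length and can be handed
    // out as a mutable slice.
    let mut values = vec![0u8; upper_bound];
    // The decoder reports how many bytes it actually wrote.
    let written = decode(input, &mut values[..])?;
    // `upper_bound` was only a conservative estimate; drop the excess.
    values.truncate(written);
    Ok(values)
}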

Are these changes tested?

Relying on existing SQL tests.

Are there any user-facing changes?

Yes, base64 and hex decoding are ~2x faster:

base64_decode/1024      time:   [26.362 µs 26.459 µs 26.597 µs]
                        change: [-48.501% -47.320% -45.897%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  4 (4.00%) high mild
  3 (3.00%) high severe

hex_decode/1024         time:   [92.789 µs 92.904 µs 93.035 µs]
                        change: [-58.973% -58.895% -58.816%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

base64_decode/4096      time:   [100.23 µs 100.30 µs 100.38 µs]
                        change: [-62.678% -62.609% -62.545%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) low severe
  1 (1.00%) high mild
  1 (1.00%) high severe

hex_decode/4096         time:   [373.11 µs 377.63 µs 383.37 µs]
                        change: [-58.265% -58.081% -57.870%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  1 (1.00%) low severe
  5 (5.00%) high mild
  6 (6.00%) high severe

base64_decode/8192      time:   [205.00 µs 205.26 µs 205.62 µs]
                        change: [-62.173% -61.549% -60.938%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  2 (2.00%) high mild
  5 (5.00%) high severe

hex_decode/8192         time:   [735.37 µs 737.54 µs 740.51 µs]
                        change: [-59.434% -59.332% -59.214%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  2 (2.00%) high mild
  5 (5.00%) high severe

alamb (Contributor) left a comment:

Looks like a very nice improvement to me, @simonvandel.

There is probably additional performance to be had by using unsafe, but this seems like an improvement over the current state to me. We can always optimize it further if/when necessary.

where
    F: Fn(&[u8], &mut [u8]) -> Result<usize>,
{
    let mut values = vec![0; conservative_upper_bound_size];

alamb (Contributor):

I think you could potentially call Vec::with_capacity rather than having to clear it all and then truncate at the end

simonvandel (Contributor, Author):

I don't think using with_capacity is possible here, as we need to hand out mutable slices that the hex/base64 methods can decode into.
With with_capacity alone, the length of the vector would be zero, so we can't take a mutable slice of it.
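
A minimal standalone illustration of that constraint:

fn main() {
    // `Vec::with_capacity` reserves memory but leaves the length at 0,
    // and slicing is bounds-checked against the length, not the capacity.
    let v: Vec<u8> = Vec::with_capacity(16);
    assert_eq!(v.len(), 0);
    // v[..16] would panic at runtime: len is 0 even though capacity is 16.

    // `vec![0; n]` zero-fills, so the buffer can be mutably sliced and the
    // unused tail trimmed once the decoder reports how much it wrote.
    let mut buf = vec![0u8; 16];
    let out = &mut buf[..16]; // fine: len is 16
    out[0] = b'x';
    buf.truncate(1);
    assert_eq!(buf, b"x");
}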

match self {
    Self::Base64 => {
        let upper_bound =
            base64::decoded_len_estimate(input_value.values().len());

alamb (Contributor):

I double checked, and the docs do say this is a conservative estimate.
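
For reference, a standalone sketch of how that estimate pairs with a slice-based decode in the base64 crate (the STANDARD engine is an assumption here; the PR may be configured differently):

use base64::{engine::general_purpose::STANDARD, Engine as _};

fn main() {
    let input = b"aGVsbG8gd29ybGQ="; // "hello world"

    // decoded_len_estimate never under-estimates, so a buffer of this
    // size is always large enough for decode_slice.
    let upper_bound = base64::decoded_len_estimate(input.len());
    let mut buf = vec![0u8; upper_bound];

    // decode_slice writes into the buffer and returns the byte count.
    let written = STANDARD.decode_slice(input, &mut buf).expect("valid base64");
    buf.truncate(written);

    assert_eq!(buf, b"hello world");
}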
