Optimize base64/hex decoding by pre-allocating output buffers (~2x faster) #12675
base: main
Conversation
Looks like a very nice improvement to me @simonvandel.
There is probably additional performance to be had by using unsafe, but this seems like an improvement over the current state to me. We can always optimize it further if/when necessary.
```rust
where
    F: Fn(&[u8], &mut [u8]) -> Result<usize>,
{
    let mut values = vec![0; conservative_upper_bound_size];
```
I think you could potentially call Vec::with_capacity rather than having to clear it all and then truncate at the end.
I don't think using with_capacity is possible here, as we need to be able to hand out mutable slices that the hex/base64 methods can decode into. Using just with_capacity, the length of the vector would be zero, so we can't mutably slice it.
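A minimal standalone sketch of the distinction (the 16-byte size is arbitrary):

```rust
fn main() {
    // With Vec::with_capacity, the capacity is 16 but the length is 0,
    // so there is no initialized region to hand out as &mut [u8];
    // slicing buf[..16] would panic.
    let buf: Vec<u8> = Vec::with_capacity(16);
    assert_eq!(buf.len(), 0);

    // vec![0; n] both allocates and initializes, so the full range
    // can be borrowed mutably and decoded into.
    let mut buf = vec![0u8; 16];
    let dst: &mut [u8] = &mut buf[..];
    dst[0] = b'x';
    assert_eq!(buf[0], b'x');
}
```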
```rust
match self {
    Self::Base64 => {
        let upper_bound =
            base64::decoded_len_estimate(input_value.values().len());
```
I double-checked, and indeed the docs say this is a conservative estimate.
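For reference, a small self-contained sketch of the estimate-then-truncate pattern using the base64 crate's public API (the STANDARD engine is chosen only for illustration):

```rust
use base64::{engine::general_purpose::STANDARD, Engine as _};

fn main() {
    let encoded = b"aGVsbG8gd29ybGQ="; // base64 for "hello world"

    // decoded_len_estimate is documented to be conservative:
    // it always returns at least the actual decoded length.
    let upper_bound = base64::decoded_len_estimate(encoded.len());
    let mut buf = vec![0u8; upper_bound];

    // decode_slice returns the number of bytes actually written.
    let written = STANDARD.decode_slice(encoded, &mut buf).unwrap();
    buf.truncate(written);
    assert_eq!(&buf, b"hello world");
}
```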
Which issue does this PR close?
Closes #.
Rationale for this change
It is generally faster to make one large allocation up front than to make many small allocations. This PR pre-allocates a single output buffer sized by a conservative upper bound, decodes each value into it, and truncates to the number of bytes actually written; a sketch of the pattern follows.
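As an illustration (decode_batch and its parameters are hypothetical, not the PR's actual API):

```rust
/// Decode many values into one shared, pre-allocated buffer.
/// `decode` writes one value into the given slice and returns the
/// number of bytes written; `upper_bound` is a conservative total size.
fn decode_batch<F>(inputs: &[&[u8]], upper_bound: usize, decode: F) -> Vec<u8>
where
    F: Fn(&[u8], &mut [u8]) -> usize,
{
    // One large allocation up front instead of one Vec per value.
    let mut out = vec![0u8; upper_bound];
    let mut offset = 0;
    for input in inputs {
        offset += decode(input, &mut out[offset..]);
    }
    // Trim the conservative over-allocation to the bytes actually written.
    out.truncate(offset);
    out
}
```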
What changes are included in this PR?
Are these changes tested?
Relying on existing SQL tests.
Are there any user-facing changes?
Yes, base64 and hex decoding is ~2x faster.