Hashed utility for ValueMap to improve hashing performance #2296
base: main
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #2296 +/- ##
======================================
  Coverage    79.4%    79.4%
======================================
  Files         122      123      +1
  Lines       20783    20905    +122
======================================
+ Hits        16504    16619    +115
- Misses       4279     4286      +7
☔ View full report in Codecov by Sentry.
opentelemetry-sdk/Cargo.toml
@@ -29,6 +29,7 @@ tokio = { workspace = true, features = ["rt", "time"], optional = true }
tokio-stream = { workspace = true, optional = true }
http = { workspace = true, optional = true }
tracing = {workspace = true, optional = true}
rustc-hash = "2.0.0"
I should probably try other hash functions, but the results I posted were obtained with this crate.
Generally, my idea was that if it is good enough for the rustc compiler (which operates on a lot of small strings), it should be a perfect fit for attribute sets too.
But if you want, I can test other crates as well.
I remember that @cijothomas pointed me to this comment:
https://github.com/open-telemetry/opentelemetry-rust/pull/1564/files#diff-efe82800c0ca89e5cdf4c14f0674f753b0148351f0bfad79f89b20d295c4e6e4R58
but I'm a bit afraid that the attribute sets in the current benchmarks/stress tests differ from real-world usage (using opentelemetry-semantic-conventions), so these tests might not be very accurate either...
In general, when running benchmarks to compare performance for metric updates, please run the benchmarks in `metrics_counter.rs` instead of `metric.rs`. We haven't really fine-tuned the benchmarks in `metric.rs` in a while. We might also remove them if they don't help that much.
@fraillt This PR is trying to bring two changes at once.
Could you please split this into two different PRs?
- One PR which only has changes related to caching the computed hash value
- Another one which uses a different Hasher (could you try aHash?). Also, please note this would have to be feature flagged. We do not want to use a custom hasher by default.
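A rough sketch of what a feature-flagged hasher could look like, using only the standard library. The feature name `custom-hasher`, the `MetricsMap` alias, and the `new_metrics_map` helper are hypothetical, and a small FNV-1a hasher stands in for a crate such as aHash or rustc-hash; none of this is taken from the PR itself.

```rust
use std::collections::HashMap;

/// Stand-in for a faster, non-default hasher (FNV-1a, 64-bit).
/// Only compiled when the hypothetical `custom-hasher` feature is enabled.
#[cfg(feature = "custom-hasher")]
mod custom {
    use std::hash::Hasher;

    pub struct Fnv1a(u64);

    impl Default for Fnv1a {
        fn default() -> Self {
            // FNV-1a 64-bit offset basis.
            Fnv1a(0xcbf2_9ce4_8422_2325)
        }
    }

    impl Hasher for Fnv1a {
        fn finish(&self) -> u64 {
            self.0
        }

        fn write(&mut self, bytes: &[u8]) {
            for &b in bytes {
                self.0 ^= u64::from(b);
                // FNV 64-bit prime.
                self.0 = self.0.wrapping_mul(0x0000_0100_0000_01b3);
            }
        }
    }
}

/// With the feature enabled, use the custom hasher.
#[cfg(feature = "custom-hasher")]
pub type MetricsBuildHasher = std::hash::BuildHasherDefault<custom::Fnv1a>;

/// By default, keep the standard library's hasher (SipHash with random keys).
#[cfg(not(feature = "custom-hasher"))]
pub type MetricsBuildHasher = std::collections::hash_map::RandomState;

/// Map type whose hasher is selected at compile time.
pub type MetricsMap<K, V> = HashMap<K, V, MetricsBuildHasher>;

/// Creates an empty map with whichever hasher the build selected.
pub fn new_metrics_map<K, V>() -> MetricsMap<K, V> {
    HashMap::with_hasher(MetricsBuildHasher::default())
}
```

Call sites only ever name `MetricsMap`, so switching hashers would not change any code outside these aliases, and users who do not opt in keep the default hasher.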
2aa01ee to 8e9d216
Updated code and PR description.
Ran stress/benchmarks
I see this will have huge value if we were to implement sharding ourselves, so caching the hash value will definitely help. Before proceeding further, I'd suggest confirming that implementing sharding ourselves is the right direction. See #2288 (comment)
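To illustrate why a cached hash matters for sharding, here is a minimal sketch in which a hash computed once per key selects the shard. `ShardedCounters`, `hash_once`, and the string keys are hypothetical stand-ins, not the SDK's types; in this simplified version the inner `HashMap` still hashes the key with its own hasher.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::Mutex;

/// Hypothetical sharded counter store: each shard has its own lock,
/// so updates to different shards do not contend with each other.
pub struct ShardedCounters {
    shards: Vec<Mutex<HashMap<String, u64>>>,
}

/// Computes the hash of a key once; callers pass the result around
/// instead of rehashing the key for every decision.
pub fn hash_once(key: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    hasher.finish()
}

impl ShardedCounters {
    pub fn new(shard_count: usize) -> Self {
        assert!(shard_count > 0, "at least one shard is required");
        let shards = (0..shard_count)
            .map(|_| Mutex::new(HashMap::new()))
            .collect();
        Self { shards }
    }

    /// Picks a shard from the precomputed hash; no hashing happens here.
    pub fn shard_index(&self, hash: u64) -> usize {
        (hash % (self.shards.len() as u64)) as usize
    }

    /// Adds `delta` to the counter for `key`, locking only one shard.
    pub fn add(&self, key: &str, hash: u64, delta: u64) {
        let mut shard = self.shards[self.shard_index(hash)].lock().unwrap();
        *shard.entry(key.to_owned()).or_insert(0) += delta;
    }

    /// Reads the counter for `key` from the shard its hash maps to.
    pub fn get(&self, key: &str, hash: u64) -> Option<u64> {
        let shard = self.shards[self.shard_index(hash)].lock().unwrap();
        shard.get(key).copied()
    }
}
```

If the key also carried its hash into the inner map (for example through a pre-hashed key type and a pass-through hasher), the key would be hashed exactly once per update.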
Yes, the benchmarks are basically within the noise, but it's technically an improvement: this revision avoids extra hashing in HashMap, because of ...
Regarding the stress tests, they are so weird and sensitive to so many factors... On my current machine I consistently observe the opposite result, maybe a ~10% improvement, on Rust 1.81. But your machine is totally different, a 20% performance degradation... wow :)
Changes
Remove redundant hashing in `ValueMap` by using a new `Hashed` type. It's similar to `AttributeSet`, but it is generic and supports owned and non-owned types (similar to the `Cow` type).
On the happy path there's a very small performance increase, but `Counter_Overflow` has improved significantly.
The whole reason for this PR is not a 2% performance increase, but being able to reuse the same hashed results when implementing sharding #2297.
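The idea of caching a hash can be sketched roughly as follows. This is not the PR's implementation: the `Hashed` wrapper here is simplified to owned values only (the owned/borrowed, `Cow`-like part is omitted), and `PassThroughHasher` and `HashedMap` are hypothetical names introduced for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hash, Hasher};

/// Hypothetical wrapper pairing a value with its precomputed hash.
pub struct Hashed<T> {
    value: T,
    hash: u64,
}

impl<T: Hash> Hashed<T> {
    /// Hashes the value exactly once, at construction time.
    pub fn new(value: T) -> Self {
        let mut hasher = DefaultHasher::new();
        value.hash(&mut hasher);
        let hash = hasher.finish();
        Hashed { value, hash }
    }

    pub fn hash_value(&self) -> u64 {
        self.hash
    }

    pub fn value(&self) -> &T {
        &self.value
    }
}

impl<T> Hash for Hashed<T> {
    /// Feeds only the cached hash to the hasher, never the value itself.
    fn hash<H: Hasher>(&self, state: &mut H) {
        state.write_u64(self.hash);
    }
}

impl<T: PartialEq> PartialEq for Hashed<T> {
    fn eq(&self, other: &Self) -> bool {
        // Cheap hash comparison first, full comparison only on a match.
        self.hash == other.hash && self.value == other.value
    }
}

impl<T: Eq> Eq for Hashed<T> {}

/// Hasher that returns the `u64` it was given, so the map does not
/// hash the already-hashed key a second time.
#[derive(Default)]
pub struct PassThroughHasher(u64);

impl Hasher for PassThroughHasher {
    fn finish(&self) -> u64 {
        self.0
    }

    fn write(&mut self, bytes: &[u8]) {
        // Fallback for non-u64 input; `Hashed` keys never take this path.
        for &b in bytes {
            self.0 = self.0.rotate_left(8) ^ u64::from(b);
        }
    }

    fn write_u64(&mut self, value: u64) {
        self.0 = value;
    }
}

/// Map keyed by pre-hashed values.
pub type HashedMap<K, V> = HashMap<Hashed<K>, V, BuildHasherDefault<PassThroughHasher>>;
```

With this shape, a lookup followed by an insert (or a shard selection followed by a map access) can reuse the same `Hashed` key, so the underlying attributes are hashed once rather than once per operation.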
cargo bench --bench metrics_counter
Merge requirement checklist
- `CHANGELOG.md` files updated for non-trivial, user-facing changes