Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix few Observable Counter and UpDownCounter bugs #1992

Conversation

cijothomas
Copy link
Member

@cijothomas cijothomas commented Aug 7, 2024

Fixes #1517 and other bugs. Added tests.
Copying ideas from @stormshield-fabs 's original PR: #1644

Additionally, the previous implementation of the Observable relied on the assumption that the ValueMap is drained after ever collect. This is an area that could be tweaked in the future. This PR removed that dependency anyway, so this is better future proof.

Focused on fixing the bug only. The code is poorly structured now, and could benefit from a refactoring. The Aggregation could be made generic similar to what was done in #1644 or other ways to make it manageable. Currently, there are tons of code duplication. Will attempt this after ensuring all known bugs are fixed and there is good test coverage to confidently do major refactorings.

@cijothomas cijothomas requested a review from a team August 7, 2024 06:40
}
}
}

impl<T: Number<T>> ValueMap<T> {
fn measure(&self, measurement: T, attrs: &[KeyValue]) {
if attrs.is_empty() {
self.no_attribute_value.add(measurement);
if self.assign_only {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stormshield-fabs Could you help refactor this to use generics, so we dont have this duplication everywhere and get better perf?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's extract this repeated snippet to a method for now? It would also make it easier to review the future PR that would refactor it using generics.

@@ -629,8 +667,9 @@ mod tests {
}

#[tokio::test(flavor = "multi_thread", worker_threads = 1)]
#[ignore = "Spatial aggregation is not yet implemented."]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this got accidentally fixed in some other PR, but this PR breaks it. spatial aggregation is complex and needs a separate discussion altogether!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is no issue to track this, probably good to have one.

Copy link

codecov bot commented Aug 7, 2024

Codecov Report

Attention: Patch coverage is 77.31959% with 22 lines in your changes missing coverage. Please review.

Project coverage is 74.8%. Comparing base (e905cf7) to head (150a086).

Files Patch % Lines
opentelemetry-sdk/src/metrics/internal/sum.rs 73.5% 14 Missing ⚠️
opentelemetry-sdk/src/metrics/internal/mod.rs 30.0% 7 Missing ⚠️
opentelemetry-sdk/src/metrics/mod.rs 97.0% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##            main   #1992     +/-   ##
=======================================
- Coverage   75.1%   74.8%   -0.3%     
=======================================
  Files        122     122             
  Lines      20642   20698     +56     
=======================================
- Hits       15505   15485     -20     
- Misses      5137    5213     +76     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

- Fixed various Metric aggregation bug related to
ObservableCounter,UpDownCounter including
[1517](https://github.com/open-telemetry/opentelemetry-rust/issues/1517).
[#1990](https://github.com/open-telemetry/opentelemetry-rust/pull/1990)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
[#1990](https://github.com/open-telemetry/opentelemetry-rust/pull/1990)
[#1992](https://github.com/open-telemetry/opentelemetry-rust/pull/1992)

@@ -25,40 +25,54 @@ struct ValueMap<T: Number<T>> {
has_no_value_attribute_value: AtomicBool,
no_attribute_value: T::AtomicTracker,
count: AtomicUsize,
assign_only: bool, // if true, only assign incoming value, instead of adding to existing value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we rename this to something more intuitive like is_sum_precomputed? I think that would better explain the reasoning behind the logic when reading the code:

if self.is_sum_precomputed {
    value_to_update.store(measurement);
} else {
    value_to_update.add(measurement);
}

let default = T::default();
let delta = self.value_map.no_attribute_value.get_value()
- *reported.get(&vec![]).unwrap_or(&default);
new_reported.insert(vec![], self.value_map.no_attribute_value.get_value());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be storing the computed delta value here and not do another atomic read. Otherwise, we risk storing an incorrect value from another update thread.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. The self.value_map.no_attribute_value fetched once in line 372 should be used as new_reported value instead of fetching again.

@cijothomas
Copy link
Member Author

Closing in favor of #2004

@cijothomas cijothomas closed this Aug 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ObservableCounter aggregation bug for cumulative
3 participants