Optimization for slow AsyncGauge execution #31

Merged
merged 3 commits into from
Mar 14, 2024

Conversation

gaojieliu
Collaborator

This PR introduces a dynamic way to track slow AsyncGauge metric execution and tries to avoid blocking the caller thread as much as possible. At a high level, it adds an AsyncGaugeExecutor, which implements the following strategy:

  1. There are two executors: one for regular metrics and one for slow metrics.

  2. All the metric evaluations are triggered by the caller.

  3. If the actual metric execution time exceeds the configured slow metric threshold, the metric is moved to the slow metric tracking map, which results in the following behavior:
     a. The next metric measurement call returns the cached value immediately.
     b. The submitted measurable is executed asynchronously.
     c. If the actual measurement latency later drops below the slow metric threshold, the metric is moved out of the slow metric tracking map.

  4. If the actual metric execution time is below the configured slow metric threshold, the following behavior is observed:
     a. After submitting the measurable to the regular executor, the caller waits up to the configured {@link AsyncGaugeExecutor#initialMetricsMeasurementTimeoutInMs} to collect the latest result.
     b. If the latest value cannot be collected in step a, the next call examines the previous execution to decide whether the metric should be put into the slow metric tracking map.

  5. An async thread cleans up inactive metrics from the slow metric tracking map so that entries for deleted metrics do not accumulate as garbage.
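
The strategy above can be illustrated with a minimal sketch. This is not the code merged in this PR: the class `SlowGaugeSketch`, its methods `measure`, `isSlow` and `cleanUpInactive`, the fixed pool sizes, and the use of `Double.NaN` as the "no value yet" placeholder are all assumptions made for illustration. The cleanup in step 5 is shown as a plain method that a scheduled thread could call, and concurrent calls for the same metric name are not coordinated.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.DoubleSupplier;

// Hypothetical illustration only; not the AsyncGaugeExecutor added by this PR.
final class SlowGaugeSketch {
  private static final class State {
    volatile double cachedValue = Double.NaN; // NaN until a measurement completes
    volatile Future<?> lastSubmission;
    volatile long lastSubmitMs;
    volatile long lastAccessMs;
  }

  // Daemon threads so the sketch never keeps the JVM alive.
  private static final ThreadFactory DAEMON = r -> {
    Thread t = new Thread(r);
    t.setDaemon(true);
    return t;
  };

  // Step 1: one executor for regular metrics, one for slow metrics.
  private final ExecutorService regularExecutor = Executors.newFixedThreadPool(2, DAEMON);
  private final ExecutorService slowExecutor = Executors.newFixedThreadPool(2, DAEMON);
  private final Map<String, State> states = new ConcurrentHashMap<>();
  private final Set<String> slowMetrics = ConcurrentHashMap.newKeySet();
  private final long slowThresholdMs;
  private final long initialTimeoutMs;

  SlowGaugeSketch(long slowThresholdMs, long initialTimeoutMs) {
    this.slowThresholdMs = slowThresholdMs;
    this.initialTimeoutMs = initialTimeoutMs;
  }

  boolean isSlow(String name) {
    return slowMetrics.contains(name);
  }

  // Step 2: every evaluation is triggered by the caller.
  double measure(String name, DoubleSupplier gauge) {
    State state = states.computeIfAbsent(name, k -> new State());
    long now = System.currentTimeMillis();
    state.lastAccessMs = now;
    Future<?> previous = state.lastSubmission;
    if (previous != null && !previous.isDone()) {
      // Step 4b (simplified): the previous run is still going; mark it slow
      // once it has been running longer than the threshold.
      if (now - state.lastSubmitMs > slowThresholdMs) {
        slowMetrics.add(name);
      }
      return state.cachedValue;
    }
    Runnable task = () -> {
      long start = System.nanoTime();
      double value = gauge.getAsDouble();
      long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
      state.cachedValue = value;
      if (elapsedMs > slowThresholdMs) {
        slowMetrics.add(name);    // step 3: start tracking as slow
      } else {
        slowMetrics.remove(name); // step 3c: fast again, stop tracking
      }
    };
    state.lastSubmitMs = now;
    if (slowMetrics.contains(name)) {
      // Steps 3a and 3b: refresh asynchronously, return the cached value now.
      state.lastSubmission = slowExecutor.submit(task);
      return state.cachedValue;
    }
    Future<?> submitted = regularExecutor.submit(task);
    state.lastSubmission = submitted;
    try {
      // Step 4a: wait a bounded time for the fresh value.
      submitted.get(initialTimeoutMs, TimeUnit.MILLISECONDS);
    } catch (TimeoutException | ExecutionException e) {
      // Timed out or the gauge threw: fall back to the cached value.
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return state.cachedValue;
  }

  // Step 5: drop metrics that have not been measured for maxIdleMs.
  void cleanUpInactive(long maxIdleMs) {
    long cutoff = System.currentTimeMillis() - maxIdleMs;
    states.entrySet().removeIf(e -> e.getValue().lastAccessMs < cutoff);
    slowMetrics.removeIf(name -> !states.containsKey(name));
  }
}
```

In this sketch a gauge that sleeps longer than the initial timeout returns the placeholder on its first call, is classified as slow once its task finishes, and from then on returns the cached value immediately while a refresh runs on the slow executor.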

AsyncGaugeExecutor has several config params that the user can tune according to the actual load pattern. The caller can construct a global AsyncGaugeExecutor and pass it to MetricsRepository via MetricConfig.

Collaborator

@huangminchn huangminchn left a comment


Thanks Gaojie! Looks good overall; left some comments.

Collaborator

@huangminchn huangminchn left a comment


Thanks a lot Gaojie!

@gaojieliu gaojieliu merged commit a50499f into tehuti-io:master Mar 14, 2024
1 check passed
3 participants