Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: add instrumentation on all async calls #1321

Open
ethanoroshiba opened this issue Jul 31, 2024 · 0 comments · May be fixed by #1368
Open

chore: add instrumentation on all async calls #1321

ethanoroshiba opened this issue Jul 31, 2024 · 0 comments · May be fixed by #1368
Assignees
Labels
observability observability, tracing, metrics

Comments

@ethanoroshiba
Copy link
Contributor

ethanoroshiba commented Jul 31, 2024

Mentioned in the Protocol Weekly on 7/25, instrumentation should be added on all async calls.

Things to keep in mind (via Janis):

  1. we need to remove instrumentation from all long-running tasks (in our case, essentially everything that is called T::run or T::run_until_stopped
  2. we also need to remove all events that run under these (so if you have a loop { select!((....)} construct, then there cannot be events in these
    a. The reason is that forever spans are completely useless on an observability platforms; in the best case we will see them at the end of a run, in the worst case we will lose them completely because the gRPC payload is too big (because the span contains too many events).
  3. while we want to have #[instrument(skip_all, err)], we don't necessarily want the errors to be emitted under an ERROR target. ERROR is for service-stopping issues, while WARN is for issues that can be handled (for example, through a retry).
    a. using this example, this would be #[instrument(err(level = Level::WARN)]
### Tasks
- [x] bridge-withdrawer
- [x] composer
- [x] conductor
- [ ] sequencer
- [x] sequencer-relayer

┆Issue Number: ENG-670

@ethanoroshiba ethanoroshiba self-assigned this Jul 31, 2024
@ethanoroshiba ethanoroshiba added the observability observability, tracing, metrics label Jul 31, 2024
ethanoroshiba added a commit that referenced this issue Aug 28, 2024
…1361)

## Summary
Shortened and streamlined geth and executor `run_until_stopped()`
functions to get rid of clippy exceptions and ensure events are emitted
within spans.

## Background
#1326 removed instrumentation on these `run_until_stopped()` functions,
which revealed a clippy warning for too many lines. Additionally, this
made it so that logging inside these functions would not be emitted
within any span.

## Changes
- Delegated many tasks to helper functions with instrumentation.
- Created new `utils` module to house a shared `report_exit_reason()`
function.
- Moved `ensure_chain_id_is_correct()` and `get_latest_nonce()` to
`init()` (previosuly `pre_run_checks()`)

## Testing
Passing all tests.


## Related Issues
Part of #1321

---------

Co-authored-by: Fraser Hutchison <[email protected]>
Co-authored-by: Richard Janis Goldschmidt <[email protected]>
github-merge-queue bot pushed a commit that referenced this issue Aug 28, 2024
## Summary
Added instrumentation to `astria-composer`

## Background
Adding instrumentation to all async calls will aid in tracing since
spans will be emitted even if no events happen under them.

## Changes
- Added instrumentation to all async function calls that are not
long-lived.
- Removed instrumentation from long-running functions.

## Related Issues
Part of #1321

---------

Co-authored-by: Fraser Hutchison <[email protected]>
Co-authored-by: Richard Janis Goldschmidt <[email protected]>
github-merge-queue bot pushed a commit that referenced this issue Aug 30, 2024
## Summary
Added instrumentation to `astria-conductor`

## Background
Adding instrumentation to all async calls will aid in tracing since
spans will be emitted even if no events happen under them.

## Changes
- Added instrumentation to all async function calls that are not
long-lived.
- Added instrumentation on some non-async functions that utilize logging
such that logging occurs within a span.
- Removed instrumentation on `run_until_stopped()` functions.
- Minor refactoring of `run_until_stopped()` functions to ensure logging
occurs within spans.

## Related Issues
Part of #1321

---------

Co-authored-by: Fraser Hutchison <[email protected]>
github-merge-queue bot pushed a commit that referenced this issue Aug 30, 2024
## Summary
Added instrumentation to `bridge-withdrawer`

## Background
Adding instrumentation to all async calls will aid in tracing since
spans will be emitted even if no events happen under them.

## Changes
- Added instrumentation to all async function calls that are not
long-lived.
- Minor refactor of `BridgeWithdrawer::run()` to avoid logging outside
of spans.
- Removed extraneous error logging from functions with
`#[instrument(..., err)]`.

## Related Issues
Part of #1321

---------

Co-authored-by: Fraser Hutchison <[email protected]>
jbowen93 pushed a commit that referenced this issue Sep 3, 2024
## Summary
Added instrumentation to `astria-conductor`

## Background
Adding instrumentation to all async calls will aid in tracing since
spans will be emitted even if no events happen under them.

## Changes
- Added instrumentation to all async function calls that are not
long-lived.
- Added instrumentation on some non-async functions that utilize logging
such that logging occurs within a span.
- Removed instrumentation on `run_until_stopped()` functions.
- Minor refactoring of `run_until_stopped()` functions to ensure logging
occurs within spans.

## Related Issues
Part of #1321

---------

Co-authored-by: Fraser Hutchison <[email protected]>
jbowen93 pushed a commit that referenced this issue Sep 3, 2024
## Summary
Added instrumentation to `bridge-withdrawer`

## Background
Adding instrumentation to all async calls will aid in tracing since
spans will be emitted even if no events happen under them.

## Changes
- Added instrumentation to all async function calls that are not
long-lived.
- Minor refactor of `BridgeWithdrawer::run()` to avoid logging outside
of spans.
- Removed extraneous error logging from functions with
`#[instrument(..., err)]`.

## Related Issues
Part of #1321

---------

Co-authored-by: Fraser Hutchison <[email protected]>
jbowen93 pushed a commit that referenced this issue Sep 6, 2024
## Summary
Added instrumentation to `astria-conductor`

## Background
Adding instrumentation to all async calls will aid in tracing since
spans will be emitted even if no events happen under them.

## Changes
- Added instrumentation to all async function calls that are not
long-lived.
- Added instrumentation on some non-async functions that utilize logging
such that logging occurs within a span.
- Removed instrumentation on `run_until_stopped()` functions.
- Minor refactoring of `run_until_stopped()` functions to ensure logging
occurs within spans.

## Related Issues
Part of #1321

---------

Co-authored-by: Fraser Hutchison <[email protected]>
jbowen93 pushed a commit that referenced this issue Sep 6, 2024
## Summary
Added instrumentation to `bridge-withdrawer`

## Background
Adding instrumentation to all async calls will aid in tracing since
spans will be emitted even if no events happen under them.

## Changes
- Added instrumentation to all async function calls that are not
long-lived.
- Minor refactor of `BridgeWithdrawer::run()` to avoid logging outside
of spans.
- Removed extraneous error logging from functions with
`#[instrument(..., err)]`.

## Related Issues
Part of #1321

---------

Co-authored-by: Fraser Hutchison <[email protected]>
github-merge-queue bot pushed a commit that referenced this issue Sep 6, 2024
## Summary
Added instrumentation to `astria-conductor`

## Background
Adding instrumentation to all async calls will aid in tracing since
spans will be emitted even if no events happen under them.

## Changes
- Added instrumentation to all async function calls that are not
long-lived.
- Removed instrumentation on long-lived functions.
- Minor refactoring of `run()` functions to ensure logging occurs within
spans.

## Related Issues
Part of #1321

---------

Co-authored-by: Fraser Hutchison <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
observability observability, tracing, metrics
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant