BFD-3700: Pipeline submits its own per-RIF and per-manifest metrics to CloudWatch #2514

Open
wants to merge 9 commits into
base: master

@malessi malessi commented Dec 19, 2024

JIRA Ticket:
BFD-3700

What Does This PR Do?

This PR introduces new Micrometer-based "active" and "total" processing duration timers that submit per-RIF and per-manifest dimensioned time metrics:

  • CcwRifLoadJob.manifest_processing.total
  • CcwRifLoadJob.manifest_processing.active
  • CcwRifLoadJob.rif_file_processing.total
  • CcwRifLoadJob.rif_file_processing.active

Micrometer then creates sub-metrics (which are distinct metrics themselves in CloudWatch) for each. Of particular interest to us are the following dimensioned metrics:

  • CcwRifLoadJob.manifest_processing.total.max
  • CcwRifLoadJob.manifest_processing.active.duration
  • CcwRifLoadJob.rif_file_processing.total.max
  • CcwRifLoadJob.rif_file_processing.active.duration

The above metrics represent either the total time-to-load for a given RIF or manifest (the ...max metrics) or the ongoing load time of a RIF or manifest that is still being loaded (the ...duration metrics). These new metrics can replace the equivalent metrics in our dashboards.
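The difference between the "active" and "total" timers can be sketched with a small stdlib-only model. This is not the PR's code and does not use Micrometer; the class `LoadTimerSketch`, its method names, and the injected millisecond clock are all hypothetical, chosen only to illustrate the semantics described above. A real implementation would presumably rely on Micrometer's `Timer` and `LongTaskTimer` instead.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Stdlib-only model of "active" vs "total" timing; hypothetical, not the PR's code. */
final class LoadTimerSketch {
  // Injected clock (milliseconds) so the behavior can be checked deterministically.
  private final LongSupplier clockMillis;
  // Start time of every load that is still in progress, keyed by RIF or manifest.
  private final Map<String, Long> startedAt = new ConcurrentHashMap<>();
  // Final load time of every load that has finished.
  private final Map<String, Long> totals = new ConcurrentHashMap<>();

  LoadTimerSketch(LongSupplier clockMillis) {
    this.clockMillis = clockMillis;
  }

  /** Marks the load identified by key as started. */
  void start(String key) {
    startedAt.put(key, clockMillis.getAsLong());
  }

  /** "Active" value: time elapsed so far, or 0 if the load is not in progress. */
  long activeMillis(String key) {
    Long start = startedAt.get(key);
    return start == null ? 0L : clockMillis.getAsLong() - start;
  }

  /** Marks the load as finished and records its "total" value. */
  void stop(String key) {
    Long start = startedAt.remove(key);
    if (start != null) {
      totals.put(key, clockMillis.getAsLong() - start);
    }
  }

  /** "Total" value: the final load time, or 0 if the load has not finished. */
  long totalMillis(String key) {
    return totals.getOrDefault(key, 0L);
  }

  public static void main(String[] args) {
    AtomicLong now = new AtomicLong(0);
    LoadTimerSketch timers = new LoadTimerSketch(now::get);

    timers.start("manifest-A");
    now.set(60_000); // one minute into the load
    System.out.println(timers.activeMillis("manifest-A")); // 60000

    now.set(150_000); // load finishes after 2.5 minutes
    timers.stop("manifest-A");
    System.out.println(timers.activeMillis("manifest-A")); // 0
    System.out.println(timers.totalMillis("manifest-A")); // 150000
  }
}
```

In this model the "active" value is only meaningful while a load is running, and the "total" value only exists once the load has finished, which mirrors the split between the ...duration and ...max metrics.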

What Should Reviewers Watch For?

If you're reviewing this PR, please check for these things in particular:

What Security Implications Does This PR Have?

Please indicate if this PR does any of the following:

  • Adds any new software dependencies
  • Modifies any security controls
  • Adds new transmission or storage of data
  • Any other changes that could possibly affect security?
  • I have considered the above security implications as they relate to this PR. (If one or more of the above apply, this PR cannot be merged without approval from the ISSO or team security engineer (@sb-benohe).)

Validation

Have you fully verified and tested these changes? Are the acceptance criteria met? Please provide reproducible testing instructions, code snippets, or screenshots as applicable.

  • Created the ephemeral environment 3700-test up to the pipeline, introduced a synthetic data set into the corresponding ETL bucket, and ran the BFD Pipeline, verifying that:
    • All expected metrics are submitted to CloudWatch Metrics
    • "Active" timers submit the ongoing load duration for each RIF and manifest at each reporting interval (configured for 1 minute); long load times were simulated using Thread.sleep()
    • "Total" timers submit the final, total load time after each RIF and manifest is loaded
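As a rough guide to what the checks above should show in CloudWatch, the following sketch computes the values an "active" timer would report for a simulated load. It is an idealized model, not the PR's test code: the class `ReportingSimulation` and the method `activeReports` are hypothetical, and the model assumes that reporting ticks are aligned with the start of the load, which will not hold exactly in a real run.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

/** Idealized model of per-interval "active" reports; hypothetical, not the PR's code. */
final class ReportingSimulation {

  /**
   * Returns the elapsed load time seen at each reporting tick that occurs while the load is still
   * running. Assumes the first tick happens exactly one interval after the load starts.
   */
  static List<Duration> activeReports(Duration loadTime, Duration interval) {
    if (interval.isZero() || interval.isNegative()) {
      throw new IllegalArgumentException("interval must be positive");
    }
    List<Duration> reports = new ArrayList<>();
    // Only ticks strictly before the end of the load observe an in-progress load.
    for (Duration elapsed = interval;
        elapsed.compareTo(loadTime) < 0;
        elapsed = elapsed.plus(interval)) {
      reports.add(elapsed);
    }
    return reports;
  }

  public static void main(String[] args) {
    // A load simulated to take 150 seconds, reported once per minute.
    Duration loadTime = Duration.ofSeconds(150);
    List<Duration> reports = activeReports(loadTime, Duration.ofMinutes(1));
    System.out.println(reports); // [PT1M, PT2M]
    // Once the load finishes, the "total" timer would report the full load time.
    System.out.println(loadTime); // PT2M30S
  }
}
```

Under these assumptions, a 150-second load reported every minute yields two "active" data points (60 and 120 seconds) followed by a single "total" data point of 150 seconds.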

@malessi malessi marked this pull request as ready for review December 20, 2024 18:44
@malessi malessi enabled auto-merge (squash) January 13, 2025 16:54
3 participants