BFD-3700: Pipeline submits its own per-RIF and per-manifest metrics to CloudWatch #2514
JIRA Ticket:
BFD-3700
What Does This PR Do?
This PR introduces new Micrometer-based "active" and "total" processing duration timers that submit per-RIF and per-manifest dimensioned time metrics:
CcwRifLoadJob.manifest_processing.total
CcwRifLoadJob.manifest_processing.active
CcwRifLoadJob.rif_file_processing.total
CcwRifLoadJob.rif_file_processing.active
Micrometer then creates sub-metrics (which are distinct metrics themselves in CloudWatch) for each. Of particular interest to us are the following dimensioned metrics:
CcwRifLoadJob.manifest_processing.total.max
CcwRifLoadJob.manifest_processing.active.duration
CcwRifLoadJob.rif_file_processing.total.max
CcwRifLoadJob.rif_file_processing.active.duration
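As a rough model of how the "total" and "active" timings differ, the sketch below tracks an ongoing duration while an item is being loaded and the longest completed duration once it finishes. It is plain Java for illustration only: it is not Micrometer's API and not the code in this PR, and the class name ProcessingTimerModel, its methods, and the key manifest-1 are all hypothetical.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of "active" vs "total" timings; not Micrometer and not the PR's code. */
final class ProcessingTimerModel {
    // Start timestamp (ms) of each in-flight item, keyed by its dimension value (e.g. a manifest or RIF name).
    private final Map<String, Long> activeStartMillis = new HashMap<>();
    // Longest completed processing time (ms) observed per key.
    private final Map<String, Long> maxTotalMillis = new HashMap<>();

    /** Marks the item as in flight, starting at the given timestamp. */
    void start(String key, long nowMillis) {
        activeStartMillis.put(key, nowMillis);
    }

    /** Marks the item as finished and records its total processing time. */
    void stop(String key, long nowMillis) {
        Long started = activeStartMillis.remove(key);
        if (started == null) {
            return; // never started, nothing to record
        }
        maxTotalMillis.merge(key, nowMillis - started, Long::max);
    }

    /** Ongoing load time; zero when the item is not currently in flight. */
    Duration activeDuration(String key, long nowMillis) {
        Long started = activeStartMillis.get(key);
        return started == null ? Duration.ZERO : Duration.ofMillis(nowMillis - started);
    }

    /** Longest completed load time; zero until the item has finished at least once. */
    Duration totalMax(String key) {
        return Duration.ofMillis(maxTotalMillis.getOrDefault(key, 0L));
    }

    public static void main(String[] args) {
        ProcessingTimerModel timers = new ProcessingTimerModel();
        timers.start("manifest-1", 0L);
        // One minute in: the load is still running, so only the active duration is non-zero.
        System.out.println(timers.activeDuration("manifest-1", 60_000L)); // PT1M
        System.out.println(timers.totalMax("manifest-1"));                // PT0S
        timers.stop("manifest-1", 90_000L);
        // After completion: the active duration resets and the total is recorded.
        System.out.println(timers.activeDuration("manifest-1", 120_000L)); // PT0S
        System.out.println(timers.totalMax("manifest-1"));                 // PT1M30S
    }
}
```

In this model the active value is only meaningful while a load is in flight, whereas the total value only appears after a load has completed, which mirrors the split between the ...duration and ...max metrics listed above.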
The above metrics represent either the total time-to-load for a given RIF or manifest (...max metrics) or the ongoing load time of a given RIF or manifest (...duration). These new metrics can be used to replace the equivalent metrics in our Dashboards.
What Should Reviewers Watch For?
If you're reviewing this PR, please check for these things in particular:
What Security Implications Does This PR Have?
Please indicate if this PR does any of the following:
Adds any new software dependencies
Modifies any security controls
Adds new transmission or storage of data
Any other changes that could possibly affect security?
I have considered the above security implications as they relate to this PR. (If one or more of the above apply, it cannot be merged without the ISSO or team security engineer's (@sb-benohe) approval.)
Validation
Have you fully verified and tested these changes? Are the acceptance criteria met? Please provide reproducible testing instructions, code snippets, or screenshots as applicable.
3700-test up to pipeline, introducing a synthetic data set to the corresponding ETL Bucket, and running the BFD Pipeline verifying that:
Thread.sleep()) during each reporting interval (configured for 1 minute) for each RIF and manifest