Adding alarms for github arc runner failures #1242
Updating alarms ⏰? Great! Please update the Google Sheet and add a 👍 to this message after 🙏
Staging: eks
✅ Terraform Init:
Plan: 2 to add, 0 to change, 0 to destroy
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # aws_cloudwatch_log_metric_filter.github-arc-write-alarm[0] will be created
  + resource "aws_cloudwatch_log_metric_filter" "github-arc-write-alarm" {
      + id             = (known after apply)
      + log_group_name = "/aws/containerinsights/notification-canada-ca-staging-eks-cluster/application"
      + name           = "GitHub ARC Runners Write Alarm"
      + pattern        = "WRITE ERROR: An error occured:"

      + metric_transformation {
          + name      = "aggregating-github-arc-write-alarm"
          + namespace = "LogMetrics"
          + unit      = "None"
          + value     = "1"
        }
    }

  # aws_cloudwatch_metric_alarm.github-arc-runner-write-alarm[0] will be created
  + resource "aws_cloudwatch_metric_alarm" "github-arc-runner-write-alarm" {
      + actions_enabled                       = true
      + alarm_actions                         = [
          + "arn:aws:sns:ca-central-1:239043911459:alert-critical",
        ]
      + alarm_description                     = "GitHub ARC Runners Are Failing - Check Version Deprecation"
      + alarm_name                            = "github-arc-runner-write-alarm"
      + arn                                   = (known after apply)
      + comparison_operator                   = "LessThanThreshold"
      + evaluate_low_sample_count_percentiles = (known after apply)
      + evaluation_periods                    = 1
      + id                                    = (known after apply)
      + metric_name                           = "aggregating-github-arc-write-alarm"
      + namespace                             = "LogMetrics"
      + ok_actions                            = [
          + "arn:aws:sns:ca-central-1:239043911459:alert-critical",
        ]
      + period                                = 300
      + statistic                             = "Sum"
      + tags_all                              = (known after apply)
      + threshold                             = 1
      + treat_missing_data                    = "notBreaching"
    }
Plan: 2 to add, 0 to change, 0 to destroy.
─────────────────────────────────────────────────────────────────────────────
Saved the plan to: plan.tfplan
To perform exactly these actions, run the following command to apply:
terraform apply "plan.tfplan"
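For readers who want to see what configuration could produce the plan above, here is a minimal sketch reconstructed from the planned attribute values. It is not the PR's actual diff, which this page does not show. The `cloudwatch_enabled` variable is hypothetical: the `[0]` index in the plan suggests the resources use `count`, but the real condition is unknown. All other values are copied from the plan.

```hcl
# Hypothetical toggle, introduced only for this sketch. The plan's [0] index
# suggests a count is used, but the real condition is not visible here.
variable "cloudwatch_enabled" {
  type    = bool
  default = true
}

# Emits a data point of value 1 into a custom metric for each matching log line.
resource "aws_cloudwatch_log_metric_filter" "github-arc-write-alarm" {
  count          = var.cloudwatch_enabled ? 1 : 0
  name           = "GitHub ARC Runners Write Alarm"
  pattern        = "WRITE ERROR: An error occured:" # copied verbatim from the plan
  log_group_name = "/aws/containerinsights/notification-canada-ca-staging-eks-cluster/application"

  metric_transformation {
    name      = "aggregating-github-arc-write-alarm"
    namespace = "LogMetrics"
    value     = "1"
  }
}

# Evaluates the 5-minute sum of that metric and notifies the SNS topic.
resource "aws_cloudwatch_metric_alarm" "github-arc-runner-write-alarm" {
  count               = var.cloudwatch_enabled ? 1 : 0
  alarm_name          = "github-arc-runner-write-alarm"
  alarm_description   = "GitHub ARC Runners Are Failing - Check Version Deprecation"
  comparison_operator = "LessThanThreshold" # as shown in the plan
  evaluation_periods  = 1
  metric_name         = "aggregating-github-arc-write-alarm"
  namespace           = "LogMetrics"
  period              = 300
  statistic           = "Sum"
  threshold           = 1
  treat_missing_data  = "notBreaching"
  alarm_actions       = ["arn:aws:sns:ca-central-1:239043911459:alert-critical"]
  ok_actions          = ["arn:aws:sns:ca-central-1:239043911459:alert-critical"]
}
```

As planned, the alarm condition is a five-minute sum below 1, with missing data treated as not breaching. If the intent is to alert whenever at least one matching line is logged, a `GreaterThanOrEqualToThreshold` comparison may be worth checking. That is a reading of the plan, not something the PR states.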
Conftest results
WARN - plan.json - main - Missing Common Tags: ["aws_acm_certificate.client_vpn"]
WARN - plan.json - main - Missing Common Tags: ["aws_acm_certificate.notification-canada-ca"]
WARN - plan.json - main - Missing Common Tags: ["aws_acm_certificate.notification-canada-ca-alt[0]"]
WARN - plan.json - main - Missing Common Tags: ["aws_alb.notification-canada-ca"]
WARN - plan.json - main - Missing Common Tags: ["aws_alb_listener.internal_alb_tls"]
WARN - plan.json - main - Missing Common Tags: ["aws_alb_listener.notification-canada-ca"]
WARN - plan.json - main - Missing Common Tags: ["aws_alb_target_group.internal_nginx_http"]
WARN - plan.json - main - Missing Common Tags: ["aws_alb_target_group.notification-canada-ca-admin"]
WARN - plan.json - main - Missing Common Tags: ["aws_alb_target_group.notification-canada-ca-api"]
WARN - plan.json - main - Missing Common Tags: ["aws_alb_target_group.notification-canada-ca-document"]
WARN - plan.json - main - Missing Common Tags: ["aws_alb_target_group.notification-canada-ca-document-api"]
WARN - plan.json - main - Missing Common Tags: ["aws_alb_target_group.notification-canada-ca-documentation"]
WARN - plan.json - main - Missing Common Tags: ["aws_cloudwatch_log_group.notification-canada-ca-eks-application-logs[0]"]
WARN - plan.json - main - Missing Common Tags: ["aws_cloudwatch_log_group.notification-canada-ca-eks-cluster-logs[0]"]
WARN - plan.json - main - Missing Common Tags: ["aws_cloudwatch_log_group.notification-canada-ca-eks-prometheus-logs[0]"]
WARN - plan.json - main - Missing Common Tags: ["aws_cloudwatch_metric_alarm.admin-evicted-pods[0]"]
WARN - plan.json - main - Missing Common Tags: ["aws_cloudwatch_metric_alarm.admin-pods-high-cpu-warning[0]"]
WARN - plan.json - main - Missing Common Tags: ["aws_cloudwatch_metric_alarm.admin-pods-high-memory-warning[0]"]
WARN - plan.json - main - Missing Common Tags: ["aws_cloudwatch_metric_alarm.admin-replicas-unavailable[0]"]
WARN - plan.json - main - Missing Common Tags:...
⏰
Summary | Résumé
This adds a CloudWatch log metric filter and alarm intended to trigger when GitHub Actions (ARC) runners fail because their runner version is deprecated.
Related Issues | Cartes liées
Test instructions | Instructions pour tester la modification
Terraform apply succeeds. To test the alarm, we can break the GitHub Actions runners by reverting them to a deprecated version.
Release Instructions | Instructions pour le déploiement
None.
Reviewer checklist | Liste de vérification du réviseur