[Infrastructure] Add Cloud Monitoring #139
Comments
Fun question: Cloud Monitoring or managed Prometheus, or both???
Cloud Monitoring can handle all GCP resources, and most of the standard GKE metrics are available in it. For game-specific/GKE workloads, though, Prometheus might be a better choice. The question then becomes: do you want to manage both? I would suggest a two-phase approach. First, get critical systems into Cloud Monitoring so that core systems alert on any issues. This is straightforward; all we need to determine is what to alert on. Then, as game monitoring requirements arise, we assess whether they can work in Cloud Monitoring or whether Prometheus is a better fit. My 2 cents.
My thought was more - some would like Cloud Monitoring, some would like managed Prometheus. I've seen both in the wild.
I'm not sure how helpful this is for you, but the gcloud command shown is a 'fully loaded' one that I often use. It has all the bells and whistles turned on for GKE in the monitoring, logging, resource monitoring (aka 'cost monitoring'), and notifications areas, including monitoring for the Google-managed k8s control plane components. The names of the gcloud feature flags (and their corresponding values) are essentially a one-to-one mapping to the key/value pairs that the GKE Terraform module uses, so hopefully this saves some time stubbing out something here.
Add GCP Cloud Monitoring to the project to alert on service availability.
Use Terraform to create the following:
Dashboard
Uptime checks for Endpoints
Service Availability - GKE, Redis, Spanner, Endpoints
Make monitoring optional - add an enable true/false flag.
Make the alert notification email address a variable and place it in terraform.tfvars.sample.
Add additional monitoring checks as they are suggested.
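As a starting point for the items above, here is a minimal Terraform sketch covering the enable flag, the email notification channel, and one uptime check with an alert. It is not the project's actual configuration: the variable names (`enable_monitoring`, `alert_email`, `project_id`, `endpoint_host`) and resource names are hypothetical, and it assumes the endpoint answers HTTPS on port 443 at `/`. An endpoint that is not HTTP would likely need a `tcp_check` block instead. The dashboard and the GKE, Redis, and Spanner availability alerts are left out.

```hcl
# Hypothetical variable names; adjust to the project's conventions.
variable "enable_monitoring" {
  type        = bool
  description = "Set to false to skip creating all monitoring resources."
  default     = true
}

variable "alert_email" {
  type        = string
  description = "Email address that receives alert notifications."
}

variable "project_id" {
  type        = string
  description = "GCP project that hosts the monitored endpoint."
}

variable "endpoint_host" {
  type        = string
  description = "Hostname of the endpoint to probe (assumed to serve HTTPS)."
}

# Email channel that alert policies notify.
resource "google_monitoring_notification_channel" "email" {
  count        = var.enable_monitoring ? 1 : 0
  display_name = "Alert email"
  type         = "email"

  labels = {
    email_address = var.alert_email
  }
}

# Uptime check against the endpoint, probed every 60 seconds.
resource "google_monitoring_uptime_check_config" "endpoint" {
  count        = var.enable_monitoring ? 1 : 0
  display_name = "endpoint-uptime"
  timeout      = "10s"
  period       = "60s"

  http_check {
    path    = "/" # Assumption: the root path returns a 2xx response.
    port    = 443
    use_ssl = true
  }

  monitored_resource {
    type = "uptime_url"
    labels = {
      project_id = var.project_id
      host       = var.endpoint_host
    }
  }
}

# Alert when the uptime check fails from more than one checker location.
resource "google_monitoring_alert_policy" "endpoint_down" {
  count        = var.enable_monitoring ? 1 : 0
  display_name = "Endpoint uptime check failing"
  combiner     = "OR"

  conditions {
    display_name = "Uptime check failed"

    condition_threshold {
      filter          = "metric.type=\"monitoring.googleapis.com/uptime_check/check_passed\" AND resource.type=\"uptime_url\" AND metric.label.check_id=\"${google_monitoring_uptime_check_config.endpoint[0].uptime_check_id}\""
      comparison      = "COMPARISON_GT"
      threshold_value = 1
      duration        = "60s"

      aggregations {
        alignment_period     = "1200s"
        per_series_aligner   = "ALIGN_NEXT_OLDER"
        cross_series_reducer = "REDUCE_COUNT_FALSE"
        group_by_fields      = ["resource.label.*"]
      }

      trigger {
        count = 1
      }
    }
  }

  notification_channels = [google_monitoring_notification_channel.email[0].name]
}
```

With this layout, the matching entries in terraform.tfvars.sample could look like the following (placeholder values only):

```hcl
enable_monitoring = true
alert_email       = "alerts@example.com"
```

Gating every resource on the same `count` expression keeps the on/off switch in one variable, at the cost of `[0]` indexing wherever one resource references another.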