Feature Request: Support for Dynamic Merging or Patching of Routing Tree Subtrees #4151

Open
consooo opened this issue Dec 5, 2024 · 1 comment

consooo commented Dec 5, 2024

Currently, Alertmanager's routing tree is defined statically in the alertmanager.yml configuration file. While this approach works well for environments where configurations are managed centrally, it presents challenges in dynamic or multi-tenant environments, where multiple teams or systems need to share the same Alertmanager instance.

Problem Statement

Managing notifications through Alertmanager routing trees in dynamic or multi-tenant setups is difficult because:

  1. Lack of Modularity: The routing tree is treated as a single, static object. It is not possible to dynamically merge or patch a subtree into the existing routing tree without regenerating the entire configuration.
  2. Scalability: In shared Alertmanager instances, each team may have specific routing requirements. Currently, these must be handled in a centralized, monolithic configuration file, which can become unwieldy and error-prone as the number of teams or requirements grows.
  3. Lack of Dynamic Flexibility: Dynamic environments, such as Kubernetes clusters, often require updates to routing configurations based on events (e.g., new clusters or services). This is difficult to achieve without external tooling to regenerate and reload the configuration.
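For context, the external tooling mentioned in point 3 usually amounts to something like the sketch below: collect each team's route fragment, rebuild the entire configuration, write it out, and ask Alertmanager to reload. The fragment layout, the function names, and the file handling are hypothetical. `POST /-/reload` is Alertmanager's reload endpoint. The file is written as JSON on the assumption that the YAML parser accepts JSON syntax; a real tool would more likely emit YAML through a YAML library. Receivers referenced by the fragments are assumed to be defined already in `base`.

```python
import json
import os
import tempfile
import urllib.request


def build_config(base, fragments):
    """Rebuild the whole config: every team's routes are appended under the root."""
    config = json.loads(json.dumps(base))  # deep copy so `base` is not mutated
    routes = config["route"].setdefault("routes", [])
    for team in sorted(fragments):  # sorted for a stable, diff-friendly order
        routes.extend(fragments[team])
    return config


def write_atomically(config, path):
    """Write to a temp file, then rename, so a reload never sees a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(config, handle, indent=2)
    os.replace(tmp_path, path)


def trigger_reload(base_url):
    """Ask Alertmanager to re-read its configuration file."""
    request = urllib.request.Request(base_url + "/-/reload", method="POST")
    with urllib.request.urlopen(request) as response:
        return response.status
```

Note that a change to any single team's fragment still rebuilds and reloads everything, which is the limitation this issue is about.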

Use Case

In a shared Grafana + Alertmanager setup, different teams want to manage their own notification routing policies independently, while still using a central Alertmanager instance. For example:

  • Team A wants to route all alerts with label team="A" to their Slack channel.
  • Team B wants to route alerts with label team="B" to their PagerDuty service.

Today, this requires either a central admin to maintain and update the monolithic configuration file, or each team to run its own Alertmanager instance.
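To make the use case concrete, here is a rough sketch of the routing tree the central admin ends up maintaining, written as a Python dict that mirrors the `route` section of alertmanager.yml, plus a simplified evaluator. The receiver names are made up, only equality matchers are handled, and options such as `continue` and grouping are ignored, so this is an illustration and not how Alertmanager itself is implemented.

```python
# Python mirror of the `route` section of alertmanager.yml for the two teams.
# Receiver names are hypothetical.
ROUTE = {
    "receiver": "default",
    "routes": [
        {"matchers": ['team="A"'], "receiver": "team-a-slack"},
        {"matchers": ['team="B"'], "receiver": "team-b-pagerduty"},
    ],
}


def parse_equality_matcher(matcher):
    """Split 'name="value"' into (name, value); other operators are not handled."""
    name, _, value = matcher.partition("=")
    return name.strip(), value.strip().strip('"')


def matches(route, labels):
    """True if every equality matcher on the route agrees with the alert labels."""
    return all(
        labels.get(name) == value
        for name, value in map(parse_equality_matcher, route.get("matchers", []))
    )


def resolve_receiver(route, labels):
    """Depth-first, first matching child wins; fall back to the parent's receiver."""
    for child in route.get("routes", []):
        if matches(child, labels):
            child = {"receiver": route["receiver"], **child}  # inherit receiver
            return resolve_receiver(child, labels)
    return route["receiver"]
```

Both team entries live in the same `routes` list, so every team's change goes through whoever owns that one file.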

Proposed Solution

Introduce support for dynamically merging or patching subtrees into the routing tree. This could be achieved through:

  1. Dynamic Subtree Injection: Allow teams or systems to submit their subtree configuration (e.g., via an API or separate file) to be merged into the main routing tree at a specified location.
  2. Granular Reloading: Instead of requiring a full configuration reload, allow for partial reloads where only the modified subtree is reloaded.
  3. Modular Configuration: Support breaking the routing tree into modular files or objects that can be updated independently and then aggregated by Alertmanager.
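As a rough illustration of what subtree injection (option 1) could look like, the sketch below attaches a team-owned subtree under a chosen node and replaces it on resubmission, leaving the rest of the tree untouched. Everything here is hypothetical: the function name, addressing the mount point by child indexes, and the `x_owner` bookkeeping key, which is not an Alertmanager field and would have to be stripped or stored elsewhere in a real design.

```python
import copy


def inject_subtree(root, mount_path, owner, subtree):
    """Return a new tree where `owner`'s subtree is attached under `mount_path`.

    `mount_path` is a list of child indexes leading from the root to the node
    that should receive the subtree. A subtree previously injected by the same
    owner is replaced, so a resubmission touches only that branch.
    """
    new_root = copy.deepcopy(root)  # never mutate the caller's tree
    node = new_root
    for index in mount_path:
        node = node["routes"][index]
    # `x_owner` is a hypothetical bookkeeping key, not part of Alertmanager's schema.
    tagged = {**copy.deepcopy(subtree), "x_owner": owner}
    children = node.setdefault("routes", [])
    for position, child in enumerate(children):
        if child.get("x_owner") == owner:
            children[position] = tagged  # replace the owner's existing branch
            break
    else:
        children.append(tagged)  # first submission from this owner
    return new_root
```

For example, `inject_subtree(root, [], "team-a", {"matchers": ['team="A"'], "receiver": "team-a-slack"})` would add Team A's branch directly under the root. Returning a new tree instead of mutating in place keeps the previous tree available if validation of the merged result fails.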

I hope this makes sense. We ran into this problem downstream, in crossplane-provider-grafana. There is also a related Grafana issue.

@eyazici90

This is something we would benefit from a lot in our own use cases too. I am curious what the maintainers think as well :)
I would be happy to help draft a PR if needed.
