Improve alert content and adjust their routes #11911

Open
vkuznet opened this issue Feb 23, 2024 · 2 comments · May be fixed by #12237

Comments

@vkuznet
Contributor

vkuznet commented Feb 23, 2024

Impact of the new feature
Simplify debugging process during shift operations.

Is your feature request related to a problem? Please describe.
On Mattermost (MM) I got a message about a failed transfer in CouchDB. There are two issues with it:

Describe the solution you'd like
I suggest a few possible improvements:

  • change hard-coded emails to a Mattermost channel and/or e-group
  • improve the documentation of WMCore alerts to include the location of alerts embedded in the WMCore codebase
  • fix the logging message so that it appears identically in both alerts and logs
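
For the last point, one way to keep the two in sync is to build the text once and hand the same string to both the logger and the alert sender. A minimal sketch, assuming a generic `send_alert(name, text)` callable; `report_failure` and `send_alert` are hypothetical names for illustration, not part of the WMCore API:

```python
import logging


def report_failure(logger, send_alert, alert_name, message):
    """Log a failure and send it as an alert using one shared string."""
    # Format the text exactly once so the log line and the alert cannot drift apart.
    text = f"[{alert_name}] {message}"
    logger.error("%s", text)
    send_alert(alert_name, text)
    return text
```

With this shape, searching the logs for the text seen in the alert should lead straight to the matching log line.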

Describe alternatives you've considered
Leave as is and struggle with debugging.

Additional context
WMCore documentation about alerts and AlertManager configuration:

@anpicci changed the title from "Improve alerts content and adjust their routes" to "Improve alert content and adjust their routes" on Jan 19, 2025
@vkuznet self-assigned this on Jan 21, 2025
@mapellidario
Member

I have a couple of suggestions that relate more to the issue title than to the issue description :)

  • include the cmsweb instance where the pod is running in the alert content, so that there is no need to cross-check the pod name against the output of kubectl get pods on all the clusters. One option could be to use the content of BASE_URL from the microservice configuration, used for example in data.reqmgr2Url.
  • add a config switch to turn off all notifications from a microservice, for all microservices and all alerts; see for example how it is used for msoutput, code and config
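
Both suggestions can be sketched together, under assumptions: the instance name is taken as the host part of a base URL, and a boolean flag suppresses the alert entirely. The function names, the payload keys and the `enabled` flag are hypothetical and do not reflect the actual WMCore or AlertManager interfaces:

```python
from urllib.parse import urlparse


def instance_from_url(base_url):
    """Return the host part of a service URL, or 'unknown' if it cannot be parsed."""
    host = urlparse(base_url or "").hostname
    return host or "unknown"


def build_alert(summary, description, base_url=None, enabled=True):
    """Return an alert payload tagged with the instance, or None if alerts are off."""
    if not enabled:
        # Config switch: the caller asked for no notifications at all.
        return None
    instance = instance_from_url(base_url)
    return {
        "summary": summary,
        "description": f"[{instance}] {description}",
        "instance": instance,
    }
```

A caller would pass the value it already reads from its own configuration, for example the URL stored in data.reqmgr2Url, so the reader of the alert sees the instance without looking up the pod.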

@vkuznet
Contributor Author

vkuznet commented Jan 22, 2025

@mapellidario, thanks for the suggestions. Even though I think they would be useful, they are not free and will require additional changes, in particular:

  • to extract the Python configuration we need to adjust the AlertManagerAPI object to accept it in its constructor
  • this by itself will add a new requirement for AlertManagerAPI to always come with a WM configuration object, which I think would be a mistake: the current code can be used without such a configuration, and this change would make AlertManagerAPI dependent on WMCore.Configuration
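
To illustrate the dependency concern, here is a sketch of the alternative in which the constructor takes a plain optional value instead of a configuration object. `AlertSender`, its `instance` argument and the dictionary keys are hypothetical stand-ins, not the real AlertManagerAPI signature:

```python
class AlertSender:
    """Hypothetical alert client; a stand-in, not the WMCore AlertManagerAPI."""

    def __init__(self, url, instance=None):
        # `instance` is a plain string supplied by the caller, so this class
        # never needs to import or understand a configuration object.
        self.url = url
        self.instance = instance

    def describe(self, description):
        """Prefix the description with the instance name when one was given."""
        if self.instance:
            return f"[{self.instance}] {description}"
        return description


# Caller side: the service already holds its configuration (a dict here,
# purely for illustration) and passes down only the value that is needed.
service_config = {"alertManagerUrl": "http://localhost:9093", "instance": "testbed"}
sender = AlertSender(service_config["alertManagerUrl"],
                     instance=service_config.get("instance"))
```

In this shape the client stays usable without any configuration, and existing callers that pass only a URL keep working unchanged.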

The config switch you are talking about is already present in the WM code, and the proposed changes are fully encapsulated within the AlertManagerAPI object. Therefore there is no need to add the switch inside it, since the upstream code (MS and others) will handle the configuration properly.

That said, I'm not against adding this, but I would rather hear the opinions of @amaltaro and @todor-ivanov on this suggestion.
