more context/information in alert notifications #15

Open
woodsaj opened this issue Apr 4, 2016 · 5 comments

Comments


woodsaj commented Apr 4, 2016

Issue by Dieterbe
Friday Jul 24, 2015 at 21:00 GMT
Originally opened as raintank/grafana#370


current litmus alerts are quite plain. they convey time, endpoint and state.
can we add a lot of value with a small amount of work? a worthy thought experiment, especially as it would also help standard grafana.

  • perhaps include a png rendered image of the monitor over time? perhaps more than one (let's say a 24h one, a 6h one and a 1h one)
  • can we include a link to a snapshot? this doesn't seem super valuable though. we might as well just link to the live dashboard, optionally set for a certain timerange.
  • can the email itself be a dashboard snapshot? that would be pretty sweet. having flot with zoom, datapoint inspection etc right in your mail box. main problem is that AFAIK javascript is disabled in email clients.
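
A minimal sketch of the first two ideas: a link to the live dashboard pinned to a time range, plus one graph URL per window (24h, 6h, 1h). The base URL, dashboard path, and render path below are hypothetical placeholders, and the `panelId`, `width`, and `height` query parameters are assumptions about what the render endpoint accepts; only the relative `from`/`to` time style (`now-6h`, `now`) is taken from how Grafana dashboards express time ranges.

```python
from urllib.parse import urlencode


def dashboard_link(base_url, dashboard_path, time_from, time_to="now"):
    """Build a link to the live dashboard, pinned to a time range."""
    query = urlencode({"from": time_from, "to": time_to})
    return f"{base_url.rstrip('/')}/{dashboard_path.lstrip('/')}?{query}"


def graph_links(base_url, render_path, panel_id, windows=("24h", "6h", "1h")):
    """Build one image URL per time window for a single panel.

    render_path and the query parameter names are assumptions, not a
    confirmed API; adjust them to whatever the renderer really expects.
    """
    links = {}
    for window in windows:
        query = urlencode({
            "panelId": panel_id,
            "from": f"now-{window}",
            "to": "now",
            "width": 600,
            "height": 300,
        })
        links[window] = f"{base_url.rstrip('/')}/{render_path.lstrip('/')}?{query}"
    return links


if __name__ == "__main__":
    base = "https://grafana.example.com"  # hypothetical host
    print(dashboard_link(base, "/dashboard/db/litmus-endpoint", "now-6h"))
    for window, url in graph_links(base, "/render/litmus-endpoint", 1).items():
        print(window, url)
```

Keeping the windows as a plain tuple makes it cheap to try different combinations before settling on which graphs are worth the space in an email.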

woodsaj commented Apr 4, 2016

Comment by nopzor1200
Friday Jul 24, 2015 at 21:41 GMT


These are all really good ideas. I'd love to do some of this stuff. I guess I'd like to know how possible some of this stuff is, right now, with our current "batteries"? Could use your guidance in knowing what's more practical at this point.

This is related to #125. Can you help make these variables available? It's currently blocked, and it's a more actionable item that could use your help. It would allow us to show the time (important) as well as create a link to the dashboard (as you suggest).

PNG rendered image of monitor over time included in the alert email seems killer. I like the idea of different timeframes.

For Litmus I like the idea of being more specific. One of the biggest wins would be to detail the status of the individual collectors in the alert query over time. For example: great, I know nopzair.com is down from "more than 2 locations". But which 2 locations is it supposedly down for? For how long?


woodsaj commented Apr 4, 2016

Comment by Dieterbe
Monday Aug 03, 2015 at 08:36 GMT


> But which 2 locations is it supposedly down for?

this might be fairly easy via bosun expression.

> For how long?

for how long what? how long it has been down from each location? if you're getting a down notification then it most often means it just went down, although there could be some cases where it may have been down from one or two collectors for a while longer. retrieving this information is non-trivial though: you'd basically have to do extra queries and travel far enough back in time until the answer is in the data. probably better to just include a graph of the last X hours.
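
To make the "travel back in time" cost concrete, here is a rough sketch of answering "which collectors, and since when" from data that has already been fetched. The input shape (a dict of time-ordered `(timestamp, is_up)` samples per collector) and the collector names are assumptions for illustration; this is not how litmus or bosun store or query state.

```python
def down_since(samples_by_collector):
    """Return {collector: timestamp} for collectors that are currently down.

    samples_by_collector maps a collector name to a list of
    (timestamp, is_up) tuples sorted oldest first. The timestamp returned
    is the first sample of the current unbroken run of failures.
    """
    result = {}
    for collector, samples in samples_by_collector.items():
        start = None
        # Walk backwards from the newest sample until an "up" sample is found.
        for timestamp, is_up in reversed(samples):
            if is_up:
                break
            start = timestamp
        if start is not None:
            result[collector] = start
    return result


if __name__ == "__main__":
    samples = {
        "amsterdam": [(0, True), (60, False), (120, False)],
        "tokyo": [(0, True), (60, True), (120, True)],
        "new-york": [(0, False), (60, True), (120, False)],
    }
    print(down_since(samples))
```

If a run of failures reaches the oldest sample in the window, the real start lies further back and another, wider query would be needed, which is exactly the extra work described above.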

@torkelo @woodsaj do you know if it's possible to make a snapshot-like feature that, instead of rendering a snapshot as html+js, renders to plain html only (because email clients tend to block js)?
we could then generate the html for emails like that. or maybe this is too much work. the alternative would be to write some code that spews out the html email and makes the required "render as png" calls itself.
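
A sketch of that alternative, assuming the PNG bytes have already been obtained from some "render as png" call (not shown here). It builds a JavaScript-free HTML email with the graphs embedded inline, plus a plain-text fallback, using Python's standard `email` package. The function name, addresses, and the `alerts.invalid` domain are made up for illustration.

```python
import html
from email.message import EmailMessage
from email.utils import make_msgid


def build_alert_email(subject, sender, recipient, state_text, graphs):
    """Build an alert email with inline PNG graphs and no JavaScript.

    graphs is a list of (label, png_bytes) pairs, e.g. one per time window.
    """
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(state_text)  # plain-text fallback for simple clients

    # One Content-ID per image so the HTML can reference it as cid:...
    cids = [make_msgid(domain="alerts.invalid") for _ in graphs]
    parts = [f"<p>{html.escape(state_text)}</p>"]
    for (label, _), cid in zip(graphs, cids):
        safe_label = html.escape(label)
        # make_msgid wraps the id in <>, which must be stripped in the src.
        parts.append(f'<h3>{safe_label}</h3><img src="cid:{cid[1:-1]}" alt="{safe_label}">')
    msg.add_alternative("".join(parts), subtype="html")

    # Attach the images to the HTML part so they render inline.
    html_part = msg.get_payload()[1]
    for (_, png_bytes), cid in zip(graphs, cids):
        html_part.add_related(png_bytes, "image", "png", cid=cid)
    return msg


if __name__ == "__main__":
    fake_png = b"\x89PNG\r\n\x1a\n"  # placeholder bytes, not a real image
    message = build_alert_email(
        "endpoint example.com is down",
        "alerts@example.com",
        "oncall@example.com",
        "example.com: state=down",
        [("last 24h", fake_png), ("last 1h", fake_png)],
    )
    print(message.get_content_type())
```

Inline images add to the message size, so a small number of modest-sized graphs is probably the practical limit.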


woodsaj commented Apr 4, 2016

Comment by woodsaj
Monday Aug 03, 2015 at 13:22 GMT


No, you can't do html only. The graphing frontend is all javascript.

I find email to be a pretty limited medium for providing this type of data and would personally rather just get a link to a "Fault" dashboard that shows the details of the alert and relevant graphs that I can interact with. Just sending a link also works equally well over SMS, which many people still use, as it does over email.

It might be an idea to generate this 'fault' dashboard and push it to snapshot.raintank for persistence. All the alert details could be rendered in a text panel, and relevant graphs included.
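
A rough sketch of what generating such a 'fault' dashboard could look like: one text panel carrying the alert details, followed by one graph panel per relevant series. The field names are only loosely modeled on dashboard JSON and have not been checked against any Grafana schema version, and the step of pushing the result to the snapshot service is left out; treat every key here as an assumption.

```python
import json


def build_fault_dashboard(alert, graph_targets):
    """Build a dashboard-like dict describing a single fault.

    alert is a dict of alert details (must include "endpoint").
    graph_targets is a list of metric queries to graph.
    All panel fields are illustrative, not a verified schema.
    """
    details = "\n".join(f"{key}: {value}" for key, value in sorted(alert.items()))
    panels = [{"id": 1, "type": "text", "title": "Alert details", "content": details}]
    for panel_id, target in enumerate(graph_targets, start=2):
        panels.append({
            "id": panel_id,
            "type": "graph",
            "title": target,
            "targets": [{"target": target}],
        })
    return {
        "title": f"Fault: {alert['endpoint']}",
        "time": {"from": "now-6h", "to": "now"},
        "panels": panels,
    }


if __name__ == "__main__":
    dashboard = build_fault_dashboard(
        {"endpoint": "example.com", "state": "down", "time": "2015-08-03T13:22:00Z"},
        ["litmus.example_com.ping.loss", "litmus.example_com.http.total"],  # hypothetical metric names
    )
    print(json.dumps(dashboard, indent=2))
```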


woodsaj commented Apr 4, 2016

Comment by Dieterbe
Thursday Aug 27, 2015 at 11:15 GMT


> The graphing frontend is all javascript.

could we not replace those with png's through the "render as png" feature?

> would personally rather just get a link to a "Fault" dashboard that shows the details of the alert and relevant graphs that i can interact with.

I like this idea, and it's probably better: even if we extended the notifications with more stuff, it would only overlap more with the dedicated full-featured "here's everything you need" dashboard that we would still need.

i know that a common #monitoringsucks theme is "we need more context in our alerts so that if we get a message in the middle of the night we know what's up", but I think it's fair for all that context to be 1 click away, especially if we can provide lots of insight/context.

however, i suspect most people will use a smartphone, which has limited screen space. that makes me still think there's a lot of merit in providing some context that fits on a smartphone screen and can come inside the email, and then referring to the fault dashboard for more insights.


woodsaj commented Apr 4, 2016

Comment by torkelo
Monday Aug 31, 2015 at 10:46 GMT


I think having a png in an alert email would be quite useful.
