Fix: Do not emit health_status event for each health check attempt. #24005
base: main
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: harish2704 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Cockpit tests failed for commit 2771216d55d7fec0e86642e774fdc73f59d4a409. @martinpitt, @jelly, @mvollmer please check. |
This "breaks" cockpit-podman's testHealthcheckSystem, which assumes the current behavior of getting regular "health check ran and passed" events. I don't have a good gut feeling whether the regular "pings of life" are expected/by design or considered noise. So I'll defer to the review of the podman developers, and if this change is approved, I'll adjust cockpit-podman's tests. |
I don't have a good gut feeling whether the regular "pings of life" are expected/by design or considered noise
I don't think there is a formal design on how it should work but the way it works today is how the current users will expect it to work. The fact that it breaks cockpit testing is a good sign that we have users depending on it. As such I consider this a breaking change that is not suitable for a minor version so this would have to wait for podman 6.0 if we want to do that at all. I think reducing the event spam is a good idea in general.
Now one thing we should consider: if docker doesn't behave this way, our docker compat api should not behave this way either. One way would be to add a new field to the health_status event that is set when the status changed; then we can filter out the events that do not have it set, to make the docker clients work correctly at least.
cc @mheon @Honny1 In case you have opinions as you have been working on other healthcheck events work
Why not, reduce the noise in the logs, but I would wait for Podman 6.0. I'd also be in favor of adding a flag to enable log "saving" for each run, for podman healthcheck run (probably also for podman run), for debugging purposes or if something goes wrong.
I have to agree on this, because I searched for such documentation ( https://docs.docker.com/reference/api/engine/version/v1.41/#tag/System/operation/SystemEvents ) and there is no description of the exact behavior of those events. They only provide the list of valid events.
IMHO, Podman has a huge opportunity as a secure docker replacement, and in that sense most users (and applications targeting docker, e.g. Traefik) will expect it to work like docker. For the user base that is looking to switch to Podman, this should be considered a bug. Please note that here I am not referring to the
Exactly. This is the point I was trying to convey. I checked how So, in short, my bug report is not about behavior of |
IMO, this should not be default. I would not mind adding a config field in containers.conf to enable this more minimal events output but we should not (and, as @Luap99 pointed out, cannot without a major version) do this by default. |
@harish2704 Our API socket is split in two parts: the normal docker api endpoints and our libpod endpoints, which all start with /version/libpod/... so all the other docker compatible endpoints can and should be changed to match the docker api as closely as possible. Look into pkg/api/handlers/compat/events.go; there we use the code for both endpoints, but if you look into the logic there you will find |
Cockpit tests failed for commit 5b8d32d26bc1eebc45d82b40e9866a0459a6a500. @martinpitt, @jelly, @mvollmer please check. |
Cockpit tests failed for commit 21b6aa68c0f65c4385fad0941c08fa14e9de8dfb. @martinpitt, @jelly, @mvollmer please check. |
21b6aa6
to
3ec4a74
Compare
Emit event only if there is a change in health_status

Fixes containers#24003
Resolves containers#24005 (comment)

Pass additional isChanged flag to event creation function
Fix health check events for docker api

Signed-off-by: Harish Karumuthil <[email protected]>
3ec4a74
to
cecdca7
Compare
@Luap99
ie, if we run |
Signed-off-by: Harish Karumuthil <[email protected]>
Cockpit tests failed for commit 3ec4a744dbd9f95e5667b242f58ec110dc4049bc. @martinpitt, @jelly, @mvollmer please check. |
Cockpit tests failed for commit cecdca7. @martinpitt, @jelly, @mvollmer please check. |
Cockpit tests failed for commit 4597408. @martinpitt, @jelly, @mvollmer please check. |
This will need some API level tests to ensure it is working; have a look at test/apiv2.
// Whether state change happened for HealthStatus or not
IsHealthStatusChanged bool
This needs to be at least omitempty like other fields so it doesn't show up on non-hc events. Maybe it would be better to add this into the Attributes map instead, @mheon WDYT?
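To illustrate the omitempty point, here is a minimal sketch using the standard encoding/json package. The `event` struct and its JSON key names are assumptions made up for the example; they are not the actual Podman event schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// event is a hypothetical, trimmed-down event payload.
// The JSON key names are illustrative only.
type event struct {
	Status                string `json:"status"`
	HealthStatus          string `json:"health_status,omitempty"`
	IsHealthStatusChanged bool   `json:"is_health_status_changed,omitempty"`
}

// encode marshals an event and returns the JSON text,
// or an empty string if marshalling fails.
func encode(e event) string {
	b, err := json.Marshal(e)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// A non-healthcheck event: the health fields are omitted.
	fmt.Println(encode(event{Status: "start"}))
	// A healthcheck event whose status changed: the fields are present.
	fmt.Println(encode(event{
		Status:                "health_status",
		HealthStatus:          "healthy",
		IsHealthStatusChanged: true,
	}))
}
```

One consequence worth noting: with omitempty a false value is dropped as well, so a consumer has to treat a missing field as "not changed".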
From reading the code, I understand that the attributes map is fully allocated to storing the container's labels. That is why I decided not to use it. (Initially I thought of storing this in the attributes map.)
-func (c *Container) newContainerHealthCheckEvent(healthStatus string) {
-	if err := c.newContainerEventWithInspectData(events.HealthStatus, healthStatus, false); err != nil {
+func (c *Container) newContainerHealthCheckEvent(healthStatus string, isHcStatusChanged bool) {
+	if err := c.newContainerEventWithInspectData(events.HealthStatus, events.EventMetadata{HealthStatus: healthStatus, IsHealthStatusChanged: isHcStatusChanged}, false); err != nil {
This will cause conflicts with #23900, not sure what is best here and which one should be merged first.
I would merge #23900 first. I think this PR will need more care.
@@ -93,6 +93,10 @@ func GetEvents(w http.ResponseWriter, r *http.Request) {
 	if evt == nil {
 		continue
 	}
+	if evt.Status == events.HealthStatus && !evt.IsHealthStatusChanged{
This needs a !utils.IsLibpodRequest(r) check like the other docker compat changes below as this function is used for both the docker and libpod endpoint.
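A rough sketch of the condition this review comment asks for is shown below. It is not the actual handler code: `event`, `healthStatus`, and `shouldSkip` are hypothetical names, and the `isLibpod` argument stands in for whatever `utils.IsLibpodRequest(r)` returns for the current request.

```go
package main

import "fmt"

// event is a hypothetical stand-in for the event type handled in GetEvents.
type event struct {
	Status                string
	IsHealthStatusChanged bool
}

// healthStatus mirrors the health_status event name used in the PR.
const healthStatus = "health_status"

// shouldSkip reports whether an event should be dropped from the stream.
// isLibpod is an assumed stand-in for the result of utils.IsLibpodRequest(r):
// only docker compat requests filter out unchanged health_status events,
// libpod requests keep receiving every event.
func shouldSkip(evt event, isLibpod bool) bool {
	return !isLibpod && evt.Status == healthStatus && !evt.IsHealthStatusChanged
}

func main() {
	unchanged := event{Status: healthStatus, IsHealthStatusChanged: false}
	fmt.Println("docker compat skips unchanged:", shouldSkip(unchanged, false))
	fmt.Println("libpod skips unchanged:", shouldSkip(unchanged, true))
}
```

Keeping the endpoint check inside a single predicate makes it straightforward to cover both endpoints in API level tests.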
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
A friendly reminder that this PR had no activity for 30 days. |
Emit health_status event only if there is a change in health_status
Fixes #24003
Does this PR introduce a user-facing change?