Teleport will stay unhealthy upon dropping to zero resources while unhealthy #50375

Open
espadolini opened this issue Dec 18, 2024 · 0 comments
@espadolini
Contributor

Expected behavior:
An App/Kube/DB Teleport agent that is temporarily unhealthy and then drops to zero matched resources to serve should report as healthy (with nothing left to serve, nothing can be wrong anymore).

Current behavior:
Teleport only recovers from being unhealthy when a heartbeat succeeds, so when there are no running heartbeats, it stays unhealthy forever.

This is especially problematic in the teleport-kube-agent chart on a rollout, since the new pod is the one most likely to become unhealthy, resulting in a stuck rollout.
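
To make the failure mode concrete, here is a minimal Go sketch of the behavior described above. It is a simplified model, not Teleport's actual code: `healthTracker`, `SetHeartbeats`, `OnHeartbeat`, `HealthyCurrent` and `HealthyExpected` are all hypothetical names made up for illustration, and the assumption that there is one heartbeat per served resource is mine.

```go
package main

// healthTracker is a hypothetical, simplified model of the agent's health
// state. None of these names come from the Teleport codebase.
type healthTracker struct {
	heartbeats int  // running heartbeats (assumed: one per matched resource)
	degraded   bool // set by a failed heartbeat, cleared by a successful one
}

// SetHeartbeats records how many heartbeats are currently running.
func (h *healthTracker) SetHeartbeats(n int) {
	h.heartbeats = n
}

// OnHeartbeat records the outcome of one heartbeat attempt. It is never
// called when no heartbeats are running.
func (h *healthTracker) OnHeartbeat(ok bool) {
	h.degraded = !ok
}

// HealthyCurrent models the current behavior: only a successful heartbeat
// can clear the degraded state.
func (h *healthTracker) HealthyCurrent() bool {
	return !h.degraded
}

// HealthyExpected models the expected behavior: with zero heartbeats there
// is nothing left that can fail, so the agent reports as healthy.
func (h *healthTracker) HealthyExpected() bool {
	return h.heartbeats == 0 || !h.degraded
}
```

In this model, calling `OnHeartbeat(false)` and then `SetHeartbeats(0)` leaves `HealthyCurrent()` returning false indefinitely, because nothing will ever call `OnHeartbeat(true)`, while `HealthyExpected()` returns true as soon as the heartbeat count reaches zero. An actual fix may look quite different, for example by resetting the state when the last heartbeat stops.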

Bug details:

  • Recreation steps: cause a Teleport agent to become unhealthy, then delete all dynamic resources matched by the agent. This was initially discovered by changing filters in a discovery service in Kubernetes, which caused a mass deletion of apps and left the new agent with no apps to serve. The mass deletion is also likely to make the agent report as unhealthy for a bit, because of "There is no way to gracefully stop a single inventory control stream heartbeat" (#50237).
@espadolini espadolini added the bug label Dec 18, 2024