fix: avoid stopping nginx-agent service on package upgrade #352
Tagging @thresheek @fblr @oxpa for the sake of visibility.
I think the addition of openrc for alpine looks good, but I'm not sure about changing the current behaviour of the agent during upgrade. We require the agent to be stopped by the user before upgrade and then started again after the upgrade by the user.
@dhurley this behavior is quite counterintuitive and easy to stumble over; what was the rationale behind that decision? For context: we ran into a situation where the nginx-agent service was silently stopped during a routine package upgrade with the appropriate package manager (yum/dnf, apt, apk) on a large fleet of instances. The package scripts reported no errors during the upgrade, and there were no signs of a broken agent configuration, so from the administrator's point of view the upgrade went fine, except that we ended up with monitoring stopped on all affected instances. I'm wondering where the requirement to stop and start the agent manually came from. The agent's process can be restarted gracefully, so why prevent it from doing so?
Hi @defanator, it looks like our docs might be a bit out of date. You shouldn't need to stop the agent before upgrading. When we upgrade the agent, the old version should continue to run, and it's not until the user restarts the agent that the new version is used. Sorry about the confusion.
Thanks for fixing the issue where the agent was being stopped during upgrades. I just have questions related to upgrading on alpine.
@dhurley such behavior introduces an inconsistent setup in which the running binary is no longer available on disk, which is potentially just another delayed point of failure (a time bomb). Imagine users who don't restart the service after the upgrade until a system reboot, or until the upgrade of another component triggers a restart of services affected by a changed shared library, which is a common scenario on modern Ubuntu/Debian. In that situation, if any unexpected issue arises with the agent, it will be impossible to simply restart the service without rolling back to the previous package version. Restarting the service on upgrade (and failing immediately if the restart is unsuccessful) resolves this inconsistency and alerts the user to possible issues as soon as possible. We spent a while building similar patterns for other nginx products (worth noting, given that the agent is trying to follow the established principles and support matrix of nginx/nginx-plus); I'm happy to elaborate on this if required.
@dhurley @oliveromahony any updates on this one?
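To illustrate the "running binary no longer on disk" state described above, here is a minimal sketch of how it could be detected on Linux. It relies on the documented behavior that the `/proc/<pid>/exe` symlink of a process whose executable was unlinked ends in ` (deleted)`. The helper name `is_stale_exe` and the binary path are assumptions made for this example, not part of the PR.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the given /proc/<pid>/exe target
# indicates that the process is running from a binary that was unlinked
# (for example, replaced by a package upgrade).
is_stale_exe() {
    case "$1" in
        *" (deleted)") return 0 ;;  # executable was removed or replaced on disk
        *)             return 1 ;;  # executable still matches the file on disk
    esac
}

# Demo with a literal string; on a real host the argument would come from
# something like: readlink "/proc/$pid/exe"
# The path /usr/bin/nginx-agent is an assumption.
if is_stale_exe "/usr/bin/nginx-agent (deleted)"; then
    echo "restart needed"
fi
```

A check like this only reports the inconsistency; restarting the service from the package scripts, as proposed in this PR, avoids getting into that state at all.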
Previously it was stopped during the execution of package scripts due to incorrect behavior in prerm: https://www.debian.org/doc/debian-policy/ch-maintainerscripts.html#summary-of-ways-maintainer-scripts-are-called
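For readers unfamiliar with the linked policy section: dpkg passes the reason for the call as the first argument to prerm (`remove`, `upgrade`, `deconfigure`, `failed-upgrade`), so a script that stops the service without checking that argument will also stop it on upgrade. The sketch below only shows the dispatch; the function name `prerm_action` is hypothetical, and the old script is not shown in this conversation, so this is an assumption about the kind of bug, not a copy of the fix.

```shell
#!/bin/sh
# Hypothetical dispatch for a Debian prerm script.
#   $1: first argument passed by dpkg to prerm
prerm_action() {
    case "$1" in
        remove) echo stop ;;  # package is being removed: stop the service
        *)      echo keep ;;  # upgrade and other calls: leave the service alone
    esac
}

# A real prerm would act on the result, for example:
#   [ "$(prerm_action "$1")" = "stop" ] && service nginx-agent stop
prerm_action remove
prerm_action upgrade
```

Restarting a running service after an upgrade would then belong in a later script such as postinst rather than in prerm.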
This will end up with a new process being killed immediately by pkg: https://github.com/freebsd/pkg/blob/1.19.1/libpkg/scripts.c#L100-L103 https://github.com/freebsd/pkg/blob/1.19.1/libpkg/scripts.c#L244-L261 See also: freebsd/pkg#2128
While here, removed extra trailing spaces across the doc.
9bb401d to 9d70b92
TWIMC, rebased this against latest
Testing included:
Proposed changes
This change introduces a number of improvements around packaging nginx-agent to ensure that the following behavior is maintained:
- The nginx-agent service is called with the stop argument on package removal.
- The nginx-agent service is called with the restart argument on package upgrade if the service is running at the moment of the upgrade (previously, the service was just stopped without any notice).
- If the nginx-agent service from the older package is not running, it will not be started automatically.

While here, added OpenRC init scripts to make the service nginx-agent command fully functional on Alpine Linux.

Tested on:
Unfortunately, FreeBSD fails to follow the 2nd rule (restart on upgrade if the service was running) due to a limitation of the pkg package manager (see freebsd/pkg#2128 for more details).
Fixes #303.
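The three rules listed under Proposed changes can be sketched as a small decision function. This is an illustration only, not the code from the PR: the helper name `service_action` and the event and state labels are assumptions chosen for the example.

```shell
#!/bin/sh
# Hypothetical helper: maps a package event and the current service state
# to the action the package scripts should take.
#   $1: package event, "remove" or "upgrade" (labels are assumptions)
#   $2: service state, "running" or "stopped"
service_action() {
    case "$1:$2" in
        remove:*)        echo stop ;;     # rule 1: stop on package removal
        upgrade:running) echo restart ;;  # rule 2: restart only a running service
        upgrade:stopped) echo none ;;     # rule 3: never start a stopped service
        *)               echo none ;;     # unknown input: do nothing
    esac
}

service_action upgrade running   # prints "restart"
service_action upgrade stopped   # prints "none"
service_action remove running    # prints "stop"
```

Each package format would feed this decision from its own script arguments, and on FreeBSD the restart branch is the one affected by the pkg limitation mentioned above.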
Checklist
Before creating a PR, run through this checklist and mark each as complete.
- ... CONTRIBUTING document
- ... make install-tools and have attached any dependency changes to this pull request
- ... (README.md)