Automated backport of #2824: Wait for node readiness before starting route-agent #2825
Merged: tpantelis merged 1 commit into submariner-io:release-0.16 from skitt:automated-backport-of-#2824-origin-release-0.16 on Sep 22, 2023
Route-agent startup is subject to races with OVN components: it mounts the OVN sockets from the host, and the default host path mount behaviour is to create a directory if the path is missing. If route-agent pod initialisation happens before OVN has opened the sockets, the socket paths get created as directories. This blocks socket creation and OVN fails to start.

This wouldn't be a problem for most pods, because their tolerations ensure the node is ready (including CNI readiness) before they start. The route-agent, however, tolerates all taints to ensure it runs everywhere; this means it can start before the node is ready. There is no way to set up tolerations "except" a specific taint, so the route-agent can't be specified in such a way that it will start with any taint except node readiness or network availability.

Specifying the socket host path type doesn't solve this either: with a socket type enforced, the pod isn't started until the socket is available. The route-agent needs to be able to mount a number of different socket paths, to handle different setups, and there is never a configuration where all socket paths are available; so enforcing a socket type prevents the route-agent from starting at all.

To handle this, an init container is set up for the route-agent; it waits until the node is ready before allowing the route-agent setup to continue. This init container does not specify the host path volumes used by the main container, so the corresponding paths aren't touched on the host. As a result, the route-agent is only set up once the node is fully ready, including OVN sockets, so the appropriate sockets are mounted correctly. (Directories are still created for missing socket mounts, but that doesn't matter, because once this stage is reached the missing socket mounts correspond to paths which aren't used by OVS or OVN.)

Signed-off-by: Stephen Kitt <[email protected]>
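The wait performed by the init container can be pictured as a small polling loop. The following is only a sketch under stated assumptions, not the code from this PR: the commit message doesn't say how readiness is determined, so the sketch assumes "ready" means the node's Ready condition is "True" and neither the `node.kubernetes.io/not-ready` nor the `node.kubernetes.io/network-unavailable` taint is present. The `Node` class, `node_is_ready` and `wait_for_node` are hypothetical names introduced for illustration; a real implementation would read the node from the Kubernetes API.

```python
from dataclasses import dataclass, field

# Well-known Kubernetes taint keys that mark a node as not yet usable.
# Treating exactly these two as blocking is an assumption of this sketch.
BLOCKING_TAINTS = {
    "node.kubernetes.io/not-ready",
    "node.kubernetes.io/network-unavailable",
}


@dataclass
class Node:
    """Hypothetical, simplified stand-in for a Kubernetes Node object."""
    conditions: dict = field(default_factory=dict)  # condition type -> status
    taints: set = field(default_factory=set)        # taint keys only


def node_is_ready(node: Node) -> bool:
    """Return True if the Ready condition holds and no blocking taint is set."""
    if node.conditions.get("Ready") != "True":
        return False
    return not (node.taints & BLOCKING_TAINTS)


def wait_for_node(get_node, sleep, max_attempts=60, interval=1.0) -> bool:
    """Poll get_node() until the node is ready or attempts run out.

    get_node and sleep are injected so the loop can be exercised without
    a cluster; in a real init container they would query the API server
    and call time.sleep respectively.
    """
    for _ in range(max_attempts):
        if node_is_ready(get_node()):
            return True
        sleep(interval)
    return False
```

Because the init container declares none of the host path volumes, a loop like this can run for as long as needed without any socket path being created as a directory on the host.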
skitt requested review from Oats87, sridhargaddam and tpantelis as code owners on September 22, 2023 14:26
🤖 Created branch: z_pr2825/skitt/automated-backport-of-#2824-origin-release-0.16
tpantelis approved these changes on Sep 22, 2023
aswinsuryan approved these changes on Sep 22, 2023
🤖 Closed branches: [z_pr2825/skitt/automated-backport-of-#2824-origin-release-0.16]
dfarrell07 added the release-note-needed label (Should be mentioned in the release notes) on Sep 26, 2023
Labels
automated-backport
ready-to-test
When a PR is ready for full E2E testing
release-note-handled
release-note-needed
Should be mentioned in the release notes
Backport of #2824 on release-0.16.
#2824: Wait for node readiness before starting route-agent
For details on the backport process, see the backport requests page.
Depends on submariner-io/submariner#2722