Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't require registry during reboot and upgrade #1483

Conversation

jhernand
Copy link
Contributor

This patch adds an enhancement that describes the need to be able to boot and upgrade clusters without a registry server when all the required images have already been pulled.

Related: https://issues.redhat.com/browse/OCPBUGS-13219
Related: #1481
Related: openshift/cluster-network-operator#1803

This patch adds an enhancement that describes the need to be able to
boot and upgrade clusters without a registry server when all the
required images have already been pulled.

Related: https://issues.redhat.com/browse/OCPBUGS-13219
Related: openshift#1481
Related: openshift/cluster-network-operator#1803
Signed-off-by: Juan Hernandez <[email protected]>
approvers:
- "@sdodson"
- "@zaneb"
- "@LalatenduMohanty"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as with the other EP, can we get this down to a single approver? the other two can be reviewers

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, who should be that approver?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well since you've got 3 candidates listed, i suggest you ask them who amongst them would like to take point and if none of them feel they are the right person hopefully they can make another suggestion.

Copy link
Contributor Author

@jhernand jhernand Sep 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sdodson @zaneb @LalatenduMohanty do you want to be the approver for this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can not be approver for this enhancement as this does not need any changes in clusterversion object or cluster version operator.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having read through the EP am I correct in summarizing this as a CI testing effort only? We put in place CI jobs that ensure that e2e tests pass without any remote registry access and we're done?

Do we expect a future coordination EP between CVO and PinnedImageSet implementation that would allow something like the PinnedImageSet to assert that a given release's pullspec has been mirrored and then a webhook or pre-apply step could be added to the CVO to go check the PinnedImageSet for it to assert that all nodes have mirrored the desired content?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is mostly about testing in CI that we can reboot nodes and do upgrades without a registry server. Note that it is not about ensuring that existing end to end tests are able to run without a registry server, rather introducing new tests to verify these specific cases: reboot and upgrade. For example, we may have an end to end test that runs a workload with images that don't exist in the cluster yet. That will require a registry server to pull those images. We are not trying to address that.

Yes, once this and #1481 are accepted we will work on a new EP (based on the now closed #1432) that will introduce the orchestration of an upgrade without a registry server.

This patch adds YAML comments explaining what is the reason to add
specific reviewers.

Signed-off-by: Juan Hernandez <[email protected]>
This patch removes the mention of the test to scan source code for the
`Always` pull policy, as that would be hard to implement and unreliable
for pods that are dynamically generated by operators.

Signed-off-by: Juan Hernandez <[email protected]>
This patch adds a paragraph to the non-goals section to explicitly state
that eliminating the requirement for a registry for other operations is
a non-goal.

Signed-off-by: Juan Hernandez <[email protected]>
@jhernand jhernand force-pushed the dont_require_registry_during_boot_and_upgrade branch from 3425309 to c77ae4a Compare September 26, 2023 07:58
This patch adds a paragraph to the open questions section about image
streams. Do they block upgrades if there is no registry server
available?

Signed-off-by: Juan Hernandez <[email protected]>
@oourfali
Copy link

@bparees @jhernand any items left here?
@bparees can you please re-review?

Copy link
Contributor

@bparees bparees left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a couple nits, otherwise it lgtm.

Signed-off-by: Juan Hernandez <[email protected]>
@oourfali
Copy link

@bparees Thanks! I see Juan addressed the comments.
Can we merge this one?

@bparees
Copy link
Contributor

bparees commented Oct 11, 2023

@bparees Thanks! I see Juan addressed the comments.
Can we merge this one?

that's up to approver(s) you listed in the EP

@oourfali
Copy link

@sdodson @mrunalp do you approve as well?

This patch changes the document to use `reboot` instead of `boot` to
make it clear that this is about reboots of existing nodes, not about
initial boots of new nodes.

Signed-off-by: Juan Hernandez <[email protected]>
@jhernand jhernand changed the title Don't require registry during boot and upgrade Don't require registry during reboot and upgrade Oct 13, 2023
@oourfali
Copy link

@sdodson @mrunalp can you please review (and approve?).

upgrades less fragile we should have a test that gates the OpenShift release and
that verifies that reboots of nodes and upgrades can be performed without a
registry server. One way to ensure this is to run in CI an admission hook that
rejects/warns about any spec that uses the `Always` pull policy.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be worth thinking about adding something to the test that blocks access to the registry till the update is done.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that was the only way I could think of to test this. Blackhole registry.access.rdhat.com registry.redhat.io and quay.io after images are pinned to all nodes, rolling restart of all nodes or upgrade.

@mrunalp
Copy link
Member

mrunalp commented Oct 20, 2023

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 20, 2023
@sdodson
Copy link
Member

sdodson commented Oct 20, 2023

/approve
I don't mind approving this, but my approval should not imply agreement to complete the implementation of these jobs by any specific team. Who builds them and who monitors them needs to be a specific follow up discussion.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 20, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sdodson

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 20, 2023
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 20, 2023

@jhernand: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-ci openshift-ci bot merged commit 955c042 into openshift:master Oct 20, 2023
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants