-
Notifications
You must be signed in to change notification settings - Fork 254
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: ✨ Flexible Nova microversions #1567
base: main
Are you sure you want to change the base?
WIP: ✨ Flexible Nova microversions #1567
Conversation
Skipping CI for Draft Pull Request. |
✅ Deploy Preview for kubernetes-sigs-cluster-api-openstack ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
for i := range recognizedMicroversions { | ||
if recognizedMicroversions[i].ID == "2.1" { | ||
continue | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the 2.1 handling here? I Think we have a default (2.1) then choose a best fit (2.53, 2.60 ..)
just don't understand the special handle here..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question! I'm not sure if this is how we should do it, but I can explain what I'm trying to do here.
Before we set the minimum version we didn't set any version at all. If I understand correctly that would mean the same thing as setting 2.1.
If no version is specified then the API will behave as if a version request of v2.1 was requested.
From https://docs.openstack.org/nova/latest/reference/api-microversion-history.html
So what I'm trying to do here is to allow fallback to the default if possible. If tags are used, then we require 2.53 minimum so then I remove 2.1 from the versions that it can pick from. Probably a better way to do it would be to convert the versions to floats and then remove anything that is < 2.53, but for now I'm just playing with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incidentally however we do it, I'd probably go with ints rather than floats and drop the 2.
.
I wonder if this could be something like:
compute := s.getComputeClient()
if len(instanceSpec.Tags) > 0 {
err := compute.RequireMicroVersion(NovaTagging)
if err != nil {
...tagging not supported...
}
}
We could have similar for neutron extensions.
Thoughts:
- how can we inform the user up front, if their configuration requires more than their cloud supports?
- If we can't do it up front, how do we make the best UX we can in the failure case?
- How do we do the 'we need the multi-attach microversion' thing, because we can't detect that from k8s objects?
On the latter point: we could pre-fetch the specified volume-type, or infer the default volume type (don't know if this is possible) and check if multi-attach is set. Or we could react to a failure by trying again with a higher microversion, but the granularity of OpenStack error code likely makes this messy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how can we inform the user up front, if their configuration requires more than their cloud supports?
I think upfront is impossible here. We don't do any API calls in the webhook and for good reason, so I don't see how we would be able to do it unfortunately.
If we can't do it up front, how do we make the best UX we can in the failure case?
Probably the best we can do here is to bubble up the error to the CAPI level where it will be visible on the Machine or the Cluster.
How do we do the 'we need the multi-attach microversion' thing, because we can't detect that from k8s objects?
On the latter point: we could pre-fetch the specified volume-type, or infer the default volume type (don't know if this is possible) and check if multi-attach is set. Or we could react to a failure by trying again with a higher microversion, but the granularity of OpenStack error code likely makes this messy.
Now we are getting to the core of the issue! I think the reasonable thing to do is to keep a list of supported versions with higher priority for higher versions (as seen in this PR). That way the highest supported version would be picked (I hope), which should support as many features as possible. My thinking is that this would solve the issue without complicating the code with extra checks, errors and retries.
However, it builds on the assumption that picking the highest supported version is always best and I'm not completely sure if this is true. Could there be a case similar to the multi-attach volume situation, where OpenStack is configured in a way that a microversion is in theory supported, but in practice a lower version is needed because of specific-volume-type-or-similar
set up by the admins? I want to think the answer is no and that it would be safe to always pick the highest available versions.
pkg/clients/compute.go
Outdated
var NovaSupportedVersions = []*utils.Version{ | ||
{ID: "2.1", Priority: 10, Suffix: "/v2.1/"}, // Initial microversion release, same as specifying no version | ||
{ID: "2.53", Priority: 20, Suffix: "/v2.53/"}, // Maximum in Pike | ||
{ID: "2.60", Priority: 30, Suffix: "/v2.60/"}, // Maximum in Queens |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we forced to 2.60 at commit 2df3778
so I think at some specific call we should use that version ,but I didn't see we have such negotiation
in the following code, can you point me where's the logic that choose 2.60 after the change?Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The version is chosen here. It will check what versions are supported by the server and then pick a version from this list that is supported. I also made sure to put highest priority for 2.60 so this is what should be preferred if it is available.
pkg/clients/compute.go
Outdated
CAPO supports multiattach volume types, which were added in microversion 2.60. | ||
*/ | ||
const NovaMinimumMicroversion = "2.60" | ||
var NovaSupportedVersions = []*utils.Version{ | ||
{ID: "2.1", Priority: 10, Suffix: "/v2.1/"}, // Initial microversion release, same as specifying no version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd put the bar higher here, it's hard to correlate but we might be implicitly depending on some features only available later. Starting from Newton 1 would make sense to me.
Footnotes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point! I picked 2.1 because that seems to be what is used if no version is specified, so this would have been the situation before CAPO specified a version. Basically this list then represents the history of CAPO:
- No version (=2.1)
- 2.53 (added as requirement for tags)
- 2.60 (added as requirement for multi-attach volumes)
That said, I agree that it could be a good idea to set the bar higher and 2.38 sounds like a good starting place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Newton is old enough , we should be good to set the bar to Newton
but I'd like to make it configuration variable so someone might change it if they really want
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but I'd like to make it configuration variable so someone might change it if they really want
This was also my original suggestion. I wanted to make CAPO/Gophercloud pick up the microversion from clouds.yaml
. This way the user could set the version in there if needed and it would work the same way as for the OpenStack CLI. However, there were objections since it would be very easy to break things by setting versions that CAPO cannot work with. So now I'm trying to propose a way where CAPO still has control but it is a bit more flexible.
c38aa7f
to
cd74f5a
Compare
Updated! I added 2 more commits: one to "undo" the previous changes so it is easier to compare different approaches, and one with new changes. What I have done is basically this:
|
pkg/clients/compute.go
Outdated
if err != nil { | ||
return nil, fmt.Errorf("failed to negotiation compute service version: %w", err) | ||
} | ||
compute.Microversion = version.ID |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method needs to return a deterministic microversion. It can't be a negotiated version: it needs to be success or failure. I want my code to run with microversion X or nothing (ok, we're going to have to support 2 for volume creation/attachment, but... details).
How about:
- NewComputeClient always returns microversion 38.
- A new method
WithMicroversion(microversion int) (ComputeClient, error)
returns a copy of itself with the target microversion if it is supported?
So most code is unchanged, and the only effect is that it's now running against an older microversion.
Code which requires a newer microversion does something like:
{ // Lexical scope limits new compute client
compute, err := compute.WithMicroversion(TAGS)
if err ...
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm I'm not sure I understand. If we have deterministic microversion and no negotiation, then how would this return failure? Then we end up with the same as we already have where CAPO says the microversion is X and then we will see when making the first API call if that works or not. With negotiation we get to know this already here.
I can accept negotiating with a single microversion (so it is that microversion or nothing), but then I still think it is better to do the negotiation than just setting it without checking if it is supported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I anticipate that we'd fetch the maximum supported microversion from Nova when initially creating the service client and store it in the service client. We would then use this every time we needed to check that the required microversion was supported. i.e.
On instantiating the service client we check that microversion 38 is supported because that's the minimum. This is the microversion that all code without a higher requirement will run against.
When adding tags to a server before creation we'd do something like:
createClient, err := compute.WithMicroversion(TAGS)
if err != nil {
createClient = compute
} else {
... add tags to server create opts ...
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I can see a way forward with this!
cd74f5a
to
ab8d3bc
Compare
ab8d3bc
to
178d4d0
Compare
/test pull-cluster-api-provider-openstack-e2e-test |
1 similar comment
/test pull-cluster-api-provider-openstack-e2e-test |
178d4d0
to
746cf67
Compare
/test pull-cluster-api-provider-openstack-e2e-test |
/test pull-cluster-api-provider-openstack-e2e-test |
7433bf2
to
bc97d3e
Compare
It probably makes sense to move some of this to gophercloud. See gophercloud/gophercloud#2791 |
/test pull-cluster-api-provider-openstack-build |
bc97d3e
to
f931f03
Compare
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
21df213
to
b584752
Compare
/test pull-cluster-api-provider-openstack-build |
The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten |
/remove-lifecycle rotten |
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
/remove-lifecycle stale |
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
/remove-lifecycle stale |
b584752
to
05bbd05
Compare
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/test pull-cluster-api-provider-openstack-build |
This is an attempt to set the microversion based on what features are needed. For example, if tags are defined for the server, then we need a microversion that supports tags. If no special features are used/needed then we use the fixed minimum version
05bbd05
to
3ab8a84
Compare
/test pull-cluster-api-provider-openstack-build |
/test pull-cluster-api-provider-openstack-e2e-test |
This is related to the proposal in #1565 and is just meant as a discussion starter for now. If you have any comments, ideas or suggestions, please feel free to comment here or in #1565.
What this PR does / why we need it:
Use a list of supported versions instead of a single hard coded one. The list can be filtered based on specific feature requirements (e.g. usage of server tags). The version to use is then picked from the intersection of this list and what the server supports.
Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when PR gets merged):Fixes #1448
Special notes for your reviewer:
TODOs:
/hold