Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade to controller-runtime 0.19.0 #8020

Merged
merged 20 commits into from
Aug 26, 2024
Merged

Conversation

naemono
Copy link
Contributor

@naemono naemono commented Aug 20, 2024

Changes Detailed

There are additional breaking changes with controller-runtime 0.19.0, which mostly come down to this breaking PR. This integrates those changes into the existing codebase.

Additional changes

Was forced to update controller-tools, see failures
Also had to bump the memory from 2Gi to 4Gi (I tried 3Gi):
image

Signed-off-by: Michael Montgomery <[email protected]>
@botelastic botelastic bot added the triage label Aug 20, 2024
@naemono naemono added the >enhancement Enhancement of existing functionality label Aug 20, 2024
@botelastic botelastic bot removed the triage label Aug 20, 2024
Copy link
Contributor

@adamkasztenny adamkasztenny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Signed-off-by: Michael Montgomery <[email protected]>
Signed-off-by: Michael Montgomery <[email protected]>
naemono and others added 4 commits August 20, 2024 13:24
Signed-off-by: Michael Montgomery <[email protected]>
Signed-off-by: Michael Montgomery <[email protected]>
pkg/controller/common/watches/state.go Outdated Show resolved Hide resolved
Signed-off-by: Michael Montgomery <[email protected]>
Signed-off-by: Michael Montgomery <[email protected]>
Signed-off-by: Michael Montgomery <[email protected]>
Signed-off-by: Michael Montgomery <[email protected]>
@naemono naemono requested a review from pebrc August 20, 2024 22:28
Copy link
Collaborator

@pebrc pebrc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Codewise looks good. There seems to be an issue though with the required annotation on the elasticsearchRefs in the monitoring subsection in the Elasticsearch CRD, probably something has changed in controller-tools, we need to double check what is going on there and if other CRDs are affected as well.

@@ -566,6 +560,8 @@ spec:
type: string
type: object
type: array
required:
- elasticsearchRefs
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unexpected

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just so we are on the same page, the top level field Monitoring is set to optional:

// +kubebuilder:validation:Optional

The Metrics and Logs fields are also set to optional:
// +kubebuilder:validation:Optional

And the ElasticsearchRefs for both of these are set to required:
// +kubebuilder:validation:Required

So maybe this is a bug fix that it's now doing what it's supposed to do? I'm still investigating, and looking through issues upstream, and will test to see if this causes any issues.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, here we go:
kubernetes-sigs/controller-tools#943
kubernetes-sigs/controller-tools#944
Here's what we have:

	// +kubebuilder:validation:Required
	ElasticsearchRefs []ObjectSelector `json:"elasticsearchRefs,omitempty"`

Which is both omitempty and Required, which used to be treated as optional, is now explicitly required.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With something like the following, we get an expected failure:

spec:
  version: 8.15.0
  monitoring:
    metrics: {}
    logs: {}

Failure:

❯ kc apply -n default -f config/custom/elasticsearch-with-monitoring.yaml
elasticsearch.elasticsearch.k8s.elastic.co/elasticsearch-monitor-sink created
The Elasticsearch "elasticsearch-monitor-source" is invalid:
* spec.monitoring.logs.elasticsearchRefs: Required value
* spec.monitoring.metrics.elasticsearchRefs: Required value

And adjusting to this:

spec:
  version: 8.15.0
  monitoring:
    metrics:
      elasticsearchRefs:
      - name: elasticsearch-monitor-sink
        namespace: default
    logs:
      elasticsearchRefs:
      - name: elasticsearch-monitor-sink
        namespace: default

Gets the expected results:

❯ kc apply -n default -f config/custom/elasticsearch-with-monitoring.yaml
elasticsearch.elasticsearch.k8s.elastic.co/elasticsearch-monitor-sink unchanged
elasticsearch.elasticsearch.k8s.elastic.co/elasticsearch-monitor-source created

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wait, from the failure in the smoke tests:

        	            	Elasticsearch.elasticsearch.k8s.elastic.co "es-apm-sample-n9bk" is invalid: [spec.monitoring.logs.elasticsearchRefs: Required value, spec.monitoring.metrics.elasticsearchRefs: Required value]

There must be another permutation that could cause issues... checking.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pebrc I've found 2 paths to resolve this, none of which I particularly like:

  1. Set monitoring.*.elasticsearchRefs explicitly to optional, which was how it was being treated previously because of it also having omitEmpty in the struct tags.
  2. Set Elasticsearch.Spec.Monitoring to be a pointer. Now this brings up another whole can of worms, as we're talking about changing APIs that are V1, which has major red flags but it works (along with a slew of other code changes checking for nil). Now note the k8s api conventions concerning optional fields, which they indicate should be a pointer, or have built-in nil values [maps/slices], so should these have been a pointer from the beginning?

I've tried other options, such as the following, to no avail:

  • adding +optional to the parent optional fields.
  • removing omitEmpty from ElasticsearchRefs.

Copy link
Collaborator

@pebrc pebrc Aug 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think making it +optional sounds about right to me, if this was effectively the previous behaviour. Making Monitoring a pointer would also be possible I think as long as the serialisation of the struct stays the same (which I think is the case) But a pointer to a struct containing a slice feels kind of awkward, the nil/empty slice is the more idiomatic way of expresssing the absence of a value.

I do suggest however to run the full set of e2e tests on this PR before we merge this in case we missed another case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kvalliyurnatt brought up a good point. In addition to setting this to optional, we could add some new validation such that the webhook could catch these being missing, and reject the change. I will also add this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this is more difficult than initially anticipated as we use an non-pointer for the monitoring struct, so there's no difference between the user defining this without any logs/metrics associations, and it just being empty, so this is a check we can't do @kvalliyurnatt .

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I vote for keeping the behaviour identical to the status quo (ignoring the difficulty you mentioned).

Signed-off-by: Michael Montgomery <[email protected]>
Signed-off-by: Michael Montgomery <[email protected]>
Signed-off-by: Michael Montgomery <[email protected]>
Signed-off-by: Michael Montgomery <[email protected]>
@naemono
Copy link
Contributor Author

naemono commented Aug 22, 2024

buildkite test this -f p=gke -m s=7.17.8,s=8.15.0,s=8.16.0-SNAPSHOT

Comment on lines 532 to 534
This is not optional but was treated as such in previous version of kubernetes-sigs/controller-tools as we also specify
`omitempty` here. See https://github.com/elastic/cloud-on-k8s/pull/8020 and https://github.com/kubernetes-sigs/controller-tools/issues/943
for more details.
Copy link
Collaborator

@pebrc pebrc Aug 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want that in the public API docs. I think there is a way to make it not show up there but keep it as a comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll investigate and update, thanks...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I've partially solved this. Adding the --- removes them from the rendered yaml, but unfortunately I don't think that the crd-ref-docs tool supports this, so it's in the rendered api-docs file....

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't you put the comment above the monitoring struct, basically free-floating so that it doesn not get included. We can then in a second step modify the crd-ref-docs tool to follow the k8s convention of ignoring all comments after ---

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pebrc this has been done, and the changes look correct now. The previous full e2e test run: https://buildkite.com/elastic/cloud-on-k8s-operator/builds/9053 only had 2 failures, and they appear to be associated with the recent changes you fixed in #8021.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also created elastic/crd-ref-docs#93

Signed-off-by: Michael Montgomery <[email protected]>
Copy link
Collaborator

@pebrc pebrc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@naemono naemono enabled auto-merge (squash) August 26, 2024 13:19
@naemono naemono merged commit 624e1b2 into elastic:main Aug 26, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement Enhancement of existing functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants