Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add operator go client #731

Merged
merged 27 commits into from
Jun 23, 2023
Merged

Conversation

bstadlbauer
Copy link
Collaborator

@bstadlbauer bstadlbauer commented Jun 12, 2023

TL;DR

This PR ports all the go client code from https://github.com/bstadlbauer/dask-k8s-operator-go-client to this repo.

Long description

go uses the terms package and module the exact opposite way from the Python terminology. A module is a collection of packages in go. The following description uses the go notation.

I've added all the go code to ./dask_kubernetes/operator/go_client/, except for the go.mod and go.sum file, as those are usually at the module root, as each module is a git repository which has one combined version in the form of a git tag.
Most of the user written code is in ./dask_kubernetes/operator/go_client/pkg/apis/kubernetes.dask.org/v1/types.go which should be in-sync with the CRDs. All code in ./dask_kubernetes/operator/go_client/pkg/client is autogenerated by ./dask_kubernetes/operator/go_client/hack/regenerate-code.sh.

I've setup auto-generation and linting to run within Docker, so it should be relatively easy for a pure-Python developer to contribute without going through all the intricacies of setting up go on their machine. Both auto-generation as well as linting are part of pre-commit, guarded to only trigger on changes to the go files.

What I've realized too late is that go is very opinionated about versioning and is built around SemVer (more info here). Each major bump is in-fact a new module in go. I've worked around this by starting with major version v2023, but that also means we'll have to update the package version each year. A how-to is in the docs.

I've tested this by compiling flyteplugins against a local version of this, and all tests pass.

Closes #529

@bstadlbauer bstadlbauer marked this pull request as ready for review June 12, 2023 08:39
CODEOWNERS Outdated
Comment on lines 2 to 7
dask_kubernetes/operator/customresources/ @bstadlbauer

# Go operator
go.mod @bstadlbauer
go.sum @bstadlbauer
dask_kubernetes/operator/go_client/ @bstadlbauer
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jacobtomlinson I've added myself as a code-owner to be notified in case of any changes. Happy to revert though

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me! It's my understanding that CODEOWNERS will match the last value, so this is effectively removing me as code owner of these files. I think it would be good to specify me for at least the customresources directory too.

Copy link
Member

@jacobtomlinson jacobtomlinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for moving this over here, I think this makes sense.

I wonder if we could add any tests for this? Perhaps a toy Go application that gets built in CI?

name: check-go-generation
entry: dask_kubernetes/operator/go_client/hack/regenerate-code.sh
language: script
files: ^dask_kubernetes/operator/go_client/pkg/apis/.*\.go$
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we also want to trigger this if we modify the CRD definitions?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would we get any benefit there? As a user would have to manually keep the go files in sync. So in theory, the code-generation should always be valid in case no go files are touched?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see, I assumed the go files were autogenerated from the CRDs.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly not! I looked around a bit and seems like that's not possible at the moment :-(

CODEOWNERS Outdated
Comment on lines 2 to 7
dask_kubernetes/operator/customresources/ @bstadlbauer

# Go operator
go.mod @bstadlbauer
go.sum @bstadlbauer
dask_kubernetes/operator/go_client/ @bstadlbauer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me! It's my understanding that CODEOWNERS will match the last value, so this is effectively removing me as code owner of these files. I think it would be good to specify me for at least the customresources directory too.

@@ -0,0 +1,49 @@
module github.com/dask/dask-kubernetes/v2023
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My go skills are very intermediate, but could you say a little more about the v2023 suffix here? I've not seen this kind of thing before.

Copy link
Collaborator Author

@bstadlbauer bstadlbauer Jun 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think my skills are a lot better here, I'm probably just a few hours of googling ahead :-D

This is something that I had to do to allow for a major version != 0 or 1. Without this, a dependent module would not be able to specify say:

require(
    github.com/dask/dask-kubernetes v2023.3.2
) 

as this would fail with version "v2023.3.2" invalid: should be v0 or v1, not v2023 syntax. After finding golang/go#35732 and reading the associated docs, I've realized that each breaking change (for >=2.0.0) is realized in the form of a new module, postfixed with the major version (as here).

I'm not a particular fan of this given that the dask ecosystem uses CalVer, but go is built around SemVer, so I don't think there's much we can do if we want to have things in the same repo. It also means that we'll have to increase this major version each year, which doesn't seem ideal.

Happy to reconsider whether we should move the go code here in case you have concerns

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok thanks for the clarification. I never knew that about versions >1. My thoughts on CalVer are well documented, but it's the situation we are in.

I wonder if we could add some automation to alert us when this gets out of sync. I can totally imagine myself tagging 2024.1.0 and forgetting to update this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting post! I've been following the github discussion around why dask switched to CalVer, so that's great insight in how it's turned out. The user perspective in that post around communicating breaking changes definitely resonates.

Instead of an automation that alerts us, I've built this automation script that updates the version automatically each year. Happy to switch to a simple reminder in case you don't want the added complexity.

@bstadlbauer
Copy link
Collaborator Author

@jacobtomlinson Thanks for the note about the codeowners, though you'd be included anyway. Added you to all files in 5e60fcc.

As for testing, which kinds of tests would you imagine? Would we just like to know whether things compile or an actual test that checks whether all CRD types look as expected? 🤔

@jacobtomlinson
Copy link
Member

I think just compiling would be fine for now.

@bstadlbauer bstadlbauer force-pushed the add-operator-go-client branch 2 times, most recently from f15eeb9 to 0e701df Compare June 16, 2023 13:22
@bstadlbauer
Copy link
Collaborator Author

@jacobtomlinson Made the discussed changes:

  • Added a github workflow to automatically change the major version each year (here). It runs on the 2nd of January every year (to avoid running it on a github server where it might still be Dezember 31st). I've tested this, example PR in Update major go package version bstadlbauer/dask-kubernetes#1
  • I've added a pre-commit hook to run go build ./...

@bstadlbauer
Copy link
Collaborator Author

Also not quite sure why the tests are failing - is this expected?

@jacobtomlinson
Copy link
Member

Thanks for merging from main. The CI should be unblocked now.

@bstadlbauer
Copy link
Collaborator Author

Great, CI is looking good 👍 Should be ready to review

Copy link
Member

@jacobtomlinson jacobtomlinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome let's get this merged.

@jacobtomlinson jacobtomlinson merged commit b4d9cc4 into dask:main Jun 23, 2023
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Go client for the k8s operator
2 participants