-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] feat: port the NodeUnschedulable in-tree plugin to wasm #123
base: main
Are you sure you want to change the base?
[WIP] feat: port the NodeUnschedulable in-tree plugin to wasm #123
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: dejanzele The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
%/main.wasm: %/main.go | ||
@(cd $(@D); tinygo build -o main.wasm -scheduler=none --no-debug -target=wasi .) | ||
|
||
.PHONY: build-tinygo | ||
build-tinygo: examples/nodenumber/main.wasm examples/advanced/main.wasm examples/imagelocality/main.wasm guest/testdata/cyclestate/main.wasm guest/testdata/filter/main.wasm guest/testdata/score/main.wasm \ | ||
build-tinygo: examples/nodenumber/main.wasm examples/advanced/main.wasm examples/imagelocality/main.wasm examples/nodeunschedulable/main.wasm guest/testdata/cyclestate/main.wasm guest/testdata/filter/main.wasm guest/testdata/score/main.wasm \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not understand why the internal/e2e/scheduler_perf/wasm/nodeunschedulable
can be compiled successfully with the %/main.wasm
but the examples/nodeunschedulable
one which is almost identical fails and needs the -gc=custom -tags=custommalloc
flags.
094402b
to
ac49ac2
Compare
@@ -1,5 +1,16 @@ | |||
package api | |||
|
|||
type QueueingHint int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This refers to https://github.com/kubernetes/kubernetes/blob/master/pkg/scheduler/framework/types.go#L256-L263. Where should this be placed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this in the first place? We don't support QueueingHint in the extension yet.
) | ||
|
||
// TolerationsTolerateTaint checks if taint is tolerated by any of the tolerations. | ||
func TolerationsTolerateTaint(tolerations []*protoapi.Toleration, taint *protoapi.Taint) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This refers to this function https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/component-helpers/scheduling/corev1/helpers.go#L63-L70.
Is there a better place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should put it (along with ToleratesTaint(...)
) inside examples/nodeunschedulable.
I don't think we should maintain this function as the part of the guest SDK.
// 3. Empty toleration.key means to match all taint keys. | ||
// If toleration.key is empty, toleration.operator must be 'Exists'; | ||
// this combination means to match all taint values and all taint keys. | ||
func ToleratesTaint(toleration *v1.Toleration, taint *v1.Taint) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This refers to this function https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/api/core/v1/toleration.go#L38-L57.
As the kubernetes folder is sparsely checked out and I guess should not be modified, is there a better approach how to use this function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, kubernetes folder is only for generating the object files. Also, we cannot import the things from kubernetes because of the limitation coming from TinyGo.
So, yes, it's the only option that we almost do a copy like you do here, at least until we move to Golang SDK.
@@ -5,11 +5,11 @@ profiles: | |||
multiPoint: | |||
enabled: | |||
- name: wasm | |||
- name: NodeNumber | |||
- name: NodeUnschedulable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was edited so I can test does it work.
I'll fix it with a better config.
Here is an AI generated summary of the benchmark report (should be taken with a bit of reservation): Report on Scheduler Performance: Go In-tree vs. WASM PluginScheduling Throughput (pods/s):
Scheduling Attempt Duration (ms):
Pod Scheduling Duration (ms):
Extension Point Durations (ms):
High Percentile Analysis (Perc99):
|
Thanks for the PR @dejanzele. |
I am also heading to KubeCon, we can discuss it there :) |
@Gekko0114 @utam0k can you take a look as well? |
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
@@ -1,5 +1,16 @@ | |||
package api | |||
|
|||
type QueueingHint int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this in the first place? We don't support QueueingHint in the extension yet.
) | ||
|
||
// TolerationsTolerateTaint checks if taint is tolerated by any of the tolerations. | ||
func TolerationsTolerateTaint(tolerations []*protoapi.Toleration, taint *protoapi.Taint) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should put it (along with ToleratesTaint(...)
) inside examples/nodeunschedulable.
I don't think we should maintain this function as the part of the guest SDK.
// 3. Empty toleration.key means to match all taint keys. | ||
// If toleration.key is empty, toleration.operator must be 'Exists'; | ||
// this combination means to match all taint values and all taint keys. | ||
func ToleratesTaint(toleration *v1.Toleration, taint *v1.Taint) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, kubernetes folder is only for generating the object files. Also, we cannot import the things from kubernetes because of the limitation coming from TinyGo.
So, yes, it's the only option that we almost do a copy like you do here, at least until we move to Golang SDK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this? I mean, can we just use the wasm built from examples/nodeunschedulable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please revert all the changes in scheduler-perf test cases.
If you want to include the nodeunschedulable plugin in the scheduler-perf, it's better to create a new test case, not replacing the existing test cases to use nodeunschedulable instead of nodenumber.
nodenumber is the simplest plugin which is good to measure the overhead of wasm extension.
What type of PR is this?
/kind feature
What this PR does / why we need it:
Ports the in-tree NodeUnschedulable plugin
Which issue(s) this PR fixes:
Relates #101
Special notes for your reviewer:
This is a work in progress, I have made it work locally, but this is not finished yet. I have created this PR to get early feedback as I am not sure some things need to be handled the way I handled them.
I need to add more tests, will do that in the next couple of days.
Does this PR introduce a user-facing change?
What are the benchmark results of this change?