-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
neonvm: add readiness probe for sysfs scaling #1190
base: main
Are you sure you want to change the base?
Conversation
No changes to the coverage.
HTML Report |
e5c9385
to
249d38b
Compare
249d38b
to
1ea3177
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly LGTM, left a couple questions
::respawn:/neonvm/bin/udhcpc -t 1 -T 1 -A 1 -f -i eth0 -O 121 -O 119 -s /neonvm/bin/udhcpc.script | ||
::respawn:/neonvm/bin/udevd | ||
::wait:/neonvm/bin/udev-init.sh | ||
::respawn:/neonvm/bin/acpid -f -c /neonvm/acpi | ||
::respawn:/neonvm/bin/vector -c /neonvm/config/vector.yaml --config-dir /etc/vector --color never | ||
::respawn:/neonvm/bin/chronyd -n -f /neonvm/config/chrony.conf -l /var/log/chrony/chrony.log | ||
::respawn:/neonvm/bin/sshd -E /var/log/ssh.log -f /neonvm/config/sshd_config | ||
::respawn:/neonvm/bin/neonvmd --addr=0.0.0.0:25183 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: Why this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I thought I covered it in the PR description.
We rely on neonvmd as a source of info for the readiness check.
For example if we start neonvmd first in the VM it might be that neonvmd is ready and serving requests but acpid or udevd have not yet even started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC then, the idea behind moving neonvmd down is to give other daemons a chance to start first, right?
Is that actually guaranteed?
default: | ||
panic(fmt.Errorf("unknown pod phase: %q", pod.Status.Phase)) | ||
} | ||
} | ||
|
||
// isRunnerPodReady returns whether the runner pod is ready respecting the readiness probe of its containers. | ||
func isRunnerPodReady(pod *corev1.Pod) runnerStatusKind { | ||
if pod.Status.ContainerStatuses == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: Is there a difference between == nil
vs len(...) == 0
? I was a little surprised to see this check on a slice
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You right, I don't think there is any difference for a slice if we use == nil
or len(...) == 0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nil slices and non-nil empty slices are two different things: https://go.dev/play/p/UtIWK670wxW
I always use len(...) == 0
to cover both cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I usually tend use != nil or len(slice) depending on the semantics of a particular code (like if I am interested in length or emptiness for sure) but I am ok with using any option here.
bdc6c19
to
61c45f4
Compare
Call runner /ready which, in sysfs scaling mode, proxifies to daemon's /cpu to check if runner pod and vm is ready. Runner's endpoint /ready does nothing in case of qmp scaling model. Move neonvm-daemon line in the inittab to start it right before vmstart. Modify logic in the migration controller to not waiting for the pod readiness - neonvm-daemon doesn't start until the migration is finished. De-facto, that doesn't change behavior for the migration at all since before the PR we had no readiness probe. Signed-off-by: Mikhail Sakhnov <[email protected]>
61c45f4
to
e71bf59
Compare
Call runner /ready which, in sysfs scaling mode, proxifies to daemon's /cpu to check if runner pod and vm is ready. Runner's endpoint /ready does nothing in case of qmp scaling model.
Move neonvm-daemon line in the inittab to start it right before vmstart.
It is a blocker for #1141 because working without ready probe introduces race in some e2e tests and kuttl tries to execute shell command
kill
while VM is not fully booted and acpid rules are not in action yet.It closes #1147 which is in the backlog but since it is a blocker I extracted it from the PR #1141