VMs wouldn't power on when using a storage policy #1601
Comments
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
/remove-lifecycle stale |
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
/remove-lifecycle stale |
Since there has been no movement on the govmomi issue that's linked above, this would save you from re-adding the label over and over. |
Is the root cause here that the config file in use, https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/blob/main/test/e2e/config/vsphere-dev.yaml#L153 or https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/blob/main/test/e2e/config/vsphere-ci.yaml#L147, referred to a storage policy that did not exist in the environment? So there would be two issues:
|
I'd say the root cause is the upstream issue vmware/govmomi#2929, but still I opened this issue here because the upstream issue affects CAPV developers and it isn't obvious right away what is causing the problem. Both of your suggested points make sense to me @chrischdi. |
@johananl Wondering if this PR potentially solves/changes the issue: #2467
|
@sbueringer the govmomi bug was never addressed and at some point I got tired of doing Looking at #2467, we no longer call the problematic govmomi function; however, since we don't know the root cause, I can't be certain the problem is gone. Unfortunately, I don't have access to a vSphere environment on which I can try to reproduce this right now. As far as I'm concerned we can close this, especially given that there seems to be no traction here or on the govmomi issue. |
Hm yeah. It's hard to tell without being able to reproduce it what the problem is. Was your environment similar to the one described on #2467?
|
Maybe, I'm not sure what "resource" means in this context. I also didn't have full visibility over the environment since it's managed by my employer for multiple purposes and therefore I didn't have admin privileges. |
Alright, thx! Let's close this issue for now. Please re-open in case you (or anyone else seeing this issue in the future) observe this behavior with a CAPV version including #2467 |
/close |
@sbueringer: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
SGTM, thanks for consistently being a very reliable maintainer @sbueringer 🙏 |
Thank you! :) |
/kind bug
What steps did you take and what happened:
I tried running an e2e test using
GINKGO_FOCUS="\[PR-Blocking\]" make e2e
and the test timed out because the controller VM gets created but stays powered off. Further investigation showed that the CAPV controller never progresses past the following line:
cluster-api-provider-vsphere/pkg/services/govmomi/service.go
Line 131 in a7b6edc
This in turn is caused by the fact that the following call never returns:
cluster-api-provider-vsphere/pkg/services/govmomi/service.go
Line 322 in a7b6edc
What did you expect to happen:
I expected the VM to power on and the test to proceed.
Anything else you would like to add:
I was able to isolate the bug and it's either a problem in govmomi or a problem on the vSphere server side. I've opened an upstream issue: vmware/govmomi#2929
Workaround: Setting the VSPHERE_STORAGE_POLICY variable to an empty string makes the test converge.
The bug seems to occur also when creating clusters manually, i.e. it isn't an e2e-specific thing.
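The workaround suggests that an empty VSPHERE_STORAGE_POLICY keeps the storage policy code path from running. The sketch below illustrates that kind of guard under that assumption; it is not taken from CAPV, and `storagePolicyFromEnv` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// storagePolicyFromEnv returns the configured storage policy name and
// whether a policy should be applied at all. An unset, empty, or
// whitespace-only value means "no policy".
// getenv is injected so the logic can be exercised without touching the
// real environment.
func storagePolicyFromEnv(getenv func(string) string) (string, bool) {
	name := strings.TrimSpace(getenv("VSPHERE_STORAGE_POLICY"))
	return name, name != ""
}

func main() {
	if name, ok := storagePolicyFromEnv(os.Getenv); ok {
		fmt.Printf("would look up storage policy %q\n", name)
	} else {
		fmt.Println("no storage policy set; skipping policy lookup")
	}
}
```

Whether CAPV actually skips the lookup this way is not confirmed in this thread; the only observed fact is that the empty value makes the test converge.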
Environment:
main at d2494c3
kubectl version: n/a
/etc/os-release: n/a