Helm Operator: charts with Helm template function "lookup" always fail at sync release. #5728
Comments
Interesting. I will start looking into this and see if there is something reasonable that we can implement as a workaround. Thanks for bringing this issue to our attention! /assign
Hi @mikeshng , Here is what I have found when digging into this issue a bit more: Reading the error it seems that this is where the error actually occurs: operator-sdk/internal/helm/release/manager.go Lines 129 to 132 in 1af83ad
Diving into the operator-sdk/internal/helm/release/manager.go Lines 155 to 161 in 1af83ad
In this I think the simplest approach in this case would be for the user to implement a workaround in their templates. In the issue you referenced, I saw a template workaround that wraps functions relying on cluster access in an if statement that nil-checks the object returned by the function. For example,
image: {{ (lookup "v1" "Namespace" "" "default").metadata.name }}
would become something like:
{{ if (lookup "v1" "Namespace" "" "default") }}
image: {{ (lookup "v1" "Namespace" "" "default").metadata.name }}
{{ else }}
image: "default"
{{ end }}
I tested these changes with
I also tested these changes by running the same commands that you provided in the additional context section, and the operator ran as expected. I also don't see this anywhere in the docs, so it would probably be a good idea for us to make a docs change stating that, when using Helm, the following commands should run with no errors before attempting to use a Helm chart in an operator:
All in all, I don't think there are any reasonable changes that we could make on our end to accommodate this, and I think it should be left up to the user to ensure that their Helm chart works properly when used in a dry-run setting. I am new to the Operator SDK team and community, so I would like to leave this issue open to be discussed in our next community meeting on May 9th and see if anyone with more experience in the Helm operator realm has any insights. Thanks again for raising the issue!
I agree. Installing then uninstalling doesn't seem like a good idea.
Thank you! This is exactly what I needed.
I feel that if those are the necessary validation steps, operator-sdk should be able to perform them itself rather than asking the user to run them manually.
I assume most end users are not familiar with the inner workings of operator-sdk. They might not know that a failing Helm dry-run will keep the Helm operator from functioning properly. For example, my charts install, upgrade, and uninstall fine with helm when dry-run is not used, so most users will probably assume that such a chart is also fine for the Helm operator.
Thank you for your triage. Your suggestions are great! Much appreciated.
During the community meeting it was decided to leave this open to see if any solutions to the dry-run issue could be formulated. As per my last comment, we seem to be pretty coupled to the functionality defined by Helm for checking whether an upgrade would be valid via the dry-run option. The most prevalent workaround from the referenced Helm issue that Operator SDK could try implementing is to attempt to install and immediately uninstall the release. I don't feel this would be the best way to go about it, due to the potential domino effect that could occur if this process doesn't go smoothly. I think the simplest approach in this case would be for the user to implement a workaround in their templates. In the referenced issue, I saw a template workaround that wraps functions relying on cluster access in an if statement that nil-checks the object returned by the function. For example,
image: {{ (lookup "v1" "Namespace" "" "default").metadata.name }}
would become something like:
{{ if (lookup "v1" "Namespace" "" "default") }}
image: {{ (lookup "v1" "Namespace" "" "default").metadata.name }}
{{ else }}
image: "default"
{{ end }}
I think having end users make sure that the following Helm commands run successfully on their Helm charts would prevent this issue from occurring in their Helm operators, as they use the same dry-run environment that the Helm operator uses:
@varshaprasad96 @fabianvf @joelanford what are your thoughts?
I'd just echo what's been said:
We probably need docs for the workaround. @everettraven, can you create a follow-up issue to track that? Meanwhile, since we have a solution here, I'm closing this issue.
Helm 3.13+ has a new flag
Reopened so we can look into the new dry-run=server option.
@holyspectral would you be open to submitting a PR to fix this? Thanks!
/assign
@holyspectral I was hoping to cut this release today since it is quite overdue. As there's no open PR for this yet, do you want to try to get one in today so I can wait until Monday to cut the release, or should we punt this to the next release milestone?
Hi @oceanc80, sorry, I was traveling and missed your message. I should be able to work on this tomorrow. There will be a new parameter in watches.yaml to prevent breaking existing applications. Other than that it should be straightforward. I will be creating a PR in a few days.
Bug Report
What did you do?
Using operator-sdk, I created a Helm operator for a Helm chart (for example) that uses the lookup Helm template function. The Helm operator is unable to sync the release and always fails with the error message:
This is likely due to the fact that Helm operator sync release performs a Helm upgrade dry-run. It seems like there is an ongoing Helm issue with dry-run and lookup. See
https://github.com/helm/helm/issues/8137
In the meantime, is there a workaround that the Helm operator can do so that it can unblock charts with "lookup" functions?
What did you expect to see?
The Helm operator is able to sync the release.
What did you see instead? Under which circumstances?
I see the error above continuously.
Environment
Operator type:
/language helm
Kubernetes cluster type:
kind cluster
$ operator-sdk version
$ kubectl version
Possible Solution
Additional context
When I ran the operator-sdk create api command, I already saw a warning message identical to the sync release failure message. Here is the full list of commands I ran to reproduce. The chart I used is here.