Provide alternate to downwardAPI for podmetadata volume/data #4616
Comments
Some of the code related to podmetadata can be found in `executor.go`:

`func (we *WorkflowExecutor) monitorAnnotations(ctx context.Context) <-chan struct{}`

I believe this is used to see if the pod has been annotated with "execution control", in which case the main container must be terminated. This could be replaced by a watch on the pod via the Kubernetes API. Would you like to submit a PR?
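For illustration, here is a minimal sketch of what such a watch-based replacement could look like with client-go (assuming a recent client-go where `Watch` takes a context), instead of reading the downwardAPI-projected annotations file. The function name, parameters, and `onChange` callback are hypothetical, not existing Argo code:

```go
package example

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/fields"
	"k8s.io/client-go/kubernetes"
)

// watchPodAnnotations (hypothetical) watches this pod via the Kubernetes API and
// invokes onChange with the current annotations on every update event. This is the
// kind of watch that could stand in for the downwardAPI-projected annotations file.
func watchPodAnnotations(ctx context.Context, clientset kubernetes.Interface, namespace, podName string, onChange func(map[string]string)) error {
	opts := metav1.ListOptions{
		FieldSelector: fields.OneTermEqualSelector("metadata.name", podName).String(),
	}
	w, err := clientset.CoreV1().Pods(namespace).Watch(ctx, opts)
	if err != nil {
		return err
	}
	defer w.Stop()
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case event, ok := <-w.ResultChan():
			if !ok {
				return nil // watch channel closed; caller would need to re-establish it
			}
			if pod, ok := event.Object.(*corev1.Pod); ok {
				onChange(pod.GetAnnotations())
			}
		}
	}
}
```

As discussed below, the trade-off is that every pod then holds its own watch connection to the API server.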
@simster7 you know this code well - anything to add?
After a bit of investigation - this is non-trivial. Each workflow pod running with the k8sapi executor creates a connection to the Kubernetes API to watch for changes to the pod. This means if a workflow has N pods, it has N connections. This solution does not scale and, as a result, will not work for the

We could move the responsibility of watching for annotation changes to the executor (not unreasonable), and it could then choose which method to use (k8sapi is already watching). However, I do wonder if there is a much simpler solution: if the downward API is not available, then we just log the error and do not monitor the execution control annotation.
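A rough sketch of that simpler fallback, under the assumption that the annotations file is mounted at a fixed path inside the pod. The path, wrapper name, and channel semantics here are assumptions; `monitorAnnotations` is the existing function mentioned above, and the snippet assumes the `os` and logrus `log` imports already present in `executor.go`:

```go
// monitorAnnotationsIfAvailable (hypothetical wrapper) falls back to a no-op when the
// downwardAPI-projected annotations file does not exist, instead of failing.
func (we *WorkflowExecutor) monitorAnnotationsIfAvailable(ctx context.Context) <-chan struct{} {
	const annotationsPath = "/argo/podmetadata/annotations" // assumed mount path of the podmetadata volume

	if _, err := os.Stat(annotationsPath); os.IsNotExist(err) {
		log.Warnf("%s not found; downward API unavailable, execution control annotation will not be monitored", annotationsPath)
		done := make(chan struct{})
		close(done) // nothing to monitor; whether closing immediately matches the existing channel's contract would need checking
		return done
	}
	return we.monitorAnnotations(ctx)
}
```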
coming in with low context, this does seem like potentially the optimal solution. e.g. with k8sapi set as the executor, and watching, the podmetadata volume is just moot?
I am slightly confused by this, in that I don't understand the relationship to the monitoring of the execution control annotation, and the workflow actually working. I thought the annotations were somehow critical to the correct execution of the flow. Also, the error received is a low-level k8s error about not being able to schedule work to any matching nodes because of the unsupported volume type, so we just somehow have to not attempt mounting that volume... unless I misunderstood what you meant by "just log the error".
The annotations are used for terminating the pod - i.e. when you stop or terminate the workflow.
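For context, a hedged sketch of what the execution-control payload carried in a pod annotation might look like; the annotation key and field names below are assumptions based on this discussion, not verified against the codebase:

```go
package example

import "time"

// Assumed annotation key carrying execution-control data on the workflow pod.
const annotationKeyExecutionControl = "workflows.argoproj.io/execution"

// ExecutionControl sketches the JSON payload: when the workflow is stopped or
// terminated, the controller writes a deadline and the wait sidecar terminates
// the main container once that deadline has passed.
type ExecutionControl struct {
	Deadline *time.Time `json:"deadline,omitempty"`
}
```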
Ah ok, I understand. I do occasionally manually re-submit a flow, and then potentially delete the flow (from the UI), but those cases are typically when I'm debugging flows / configs. We can honestly live without those functions for the stable production flows, if it means we can run on instantly scalable infra.
We've just spoken about the lack of init containers. We must have a way to delay the start of the main container, and that is currently done with init containers. Can you please link to the Azure issue?
The Azure docs on virtual nodes (linked to the limitations section): https://docs.microsoft.com/en-us/azure/aks/virtual-nodes#known-limitations

Issue in the azure-aci project I created today to ask about potential support for downwardAPI volumes: virtual-kubelet/azure-aci#91 (no movement yet).

And ah... I was not thinking about the (I'm likely just confused by the name). Either way, yeah, I don't know what to do about the actual
For our needs, I can re-implement those flows to avoid the init containers.
So, I know we don't have a conceptual, generic solution for the init container beyond my willingness to re-implement our flows to avoid it completely. But as for the other main blocker, you mentioned there being a possibly simpler solution (e.g. having the option to not monitor the annotations). I'm still slightly confused, just because when I look at the annotations in the cluster, they are extensive and include input params and whatnot. I'm not following how all that data being added to the annotations is just for terminating pods prematurely, but I'll obviously take your word for it :)

But at any rate, how "simple" is it actually to do that? I don't have a Go dev setup currently, and am a little rusty, but am motivated to try to find a path to get our flows going on virtual nodes.
Fixed. |
Summary
Currently, the use of the `downwardAPI` to project the `podmetadata` volume blocks the use of argo workflows on nodes that don't support downwardAPI projections. This is currently the case for Azure's `Virtual Nodes` feature for AKS. I have worked through all the other issues (i.e. using the `k8sapi` executor to get around the docker-sock problem), but this one is currently blocking.

It's obvious why downwardAPI is being used (super convenient), but I am wondering if it would be feasible to get this data propagated via some other mechanism when that API isn't available on a particular node/nodepool. If feasible, the feature would then be exposed via config (e.g. `disableDownwardAPI`, assumed `false` if not present in order to be backward compatible).

EDIT: of equal import: virtual nodes in AKS also do not support initContainers. I know these limitations would ideally be solved at the AKS/virtual-kubelet level (and I've made a ticket over there as well), but am hoping argo could potentially be adapted faster than MSFT is likely to move on any of this. I can probably take a stab at a PR if someone with more knowledge than I have could propose a viable alternative.

The only thing I can think of is to have the init/wait/main containers make actual API calls to get the data needed, when needed. But not sure if that makes sense.
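For reference, the blocking piece is the downwardAPI projection that the controller adds to every workflow pod. Expressed with the Kubernetes API types, it is roughly the following (an approximation; the actual controller code may differ), and a `disableDownwardAPI`-style option would have to omit or replace this projection:

```go
package example

import apiv1 "k8s.io/api/core/v1"

// Approximation of the podmetadata volume added to workflow pods: the pod's own
// annotations are projected into a file named "annotations" via the downward API.
var volumePodMetadata = apiv1.Volume{
	Name: "podmetadata",
	VolumeSource: apiv1.VolumeSource{
		DownwardAPI: &apiv1.DownwardAPIVolumeSource{
			Items: []apiv1.DownwardAPIVolumeFile{
				{
					Path: "annotations",
					FieldRef: &apiv1.ObjectFieldSelector{
						APIVersion: "v1",
						FieldPath:  "metadata.annotations",
					},
				},
			},
		},
	},
}
```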
Use Cases
When deploying workflows to so-called "serverless" node solutions that don't support downwardAPI projections (i.e. AKS virtual nodes)
Message from the maintainers:
Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.