
The spec.template.spec.terminationGracePeriodSeconds: 3600 setting has no effect #28

Open
eugen-nw opened this issue Dec 11, 2019 · 17 comments


eugen-nw commented Dec 11, 2019

My container runs a Windows console application in an Azure Kubernetes instance. I subscribe via SetConsoleCtrlHandler, catch the CTRL_SHUTDOWN_EVENT (6), and call Thread.Sleep(TimeSpan.FromSeconds(3600)); in the handler so that SIGKILL won't get sent to the container. The container does indeed receive the CTRL_SHUTDOWN_EVENT, and a separate thread logs one message per second to show how long it kept waiting.
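A minimal sketch of the handler wiring (illustrative only, not the exact application code):

// Sketch of the SetConsoleCtrlHandler subscription via P/Invoke (kernel32).
using System;
using System.Runtime.InteropServices;
using System.Threading;

class Program
{
    delegate bool ConsoleCtrlHandler(int ctrlType);

    [DllImport("kernel32.dll", SetLastError = true)]
    static extern bool SetConsoleCtrlHandler(ConsoleCtrlHandler handler, bool add);

    const int CTRL_SHUTDOWN_EVENT = 6;

    // Held in a static field so the delegate isn't garbage-collected.
    static readonly ConsoleCtrlHandler Handler = OnCtrlEvent;

    static bool OnCtrlEvent(int ctrlType)
    {
        if (ctrlType == CTRL_SHUTDOWN_EVENT)
        {
            // Block the handler thread; Windows keeps the process alive until the
            // handler returns or ProcessShutdownTimeoutSeconds elapses.
            Thread.Sleep(TimeSpan.FromSeconds(3600));
        }
        return true; // event handled
    }

    static void Main()
    {
        SetConsoleCtrlHandler(Handler, true);
        // ... main work loop; a separate thread logs one message per second ...
    }
}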

I'm adding the required registry settings in the Dockerfile:

USER ContainerAdministrator
# Allow up to 3600 s for process shutdown; WaitToKillServiceTimeout is in milliseconds.
RUN reg add hklm\system\currentcontrolset\services\cexecsvc /v ProcessShutdownTimeoutSeconds /t REG_DWORD /d 3600 /f && \
    reg add hklm\system\currentcontrolset\control /v WaitToKillServiceTimeout /t REG_SZ /d 3600000 /f
ADD publish/ /

I verified this by running the container on my computer; 'docker stop -t <seconds>' achieves the delayed shutdown.
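For example, matching the 3600-second window (the container name is a placeholder):

docker stop -t 3600 solver-runner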

The relevant fragment of the .yaml deployment file:

spec:
  replicas: 1
  selector:
    matchLabels:
      app: aks-aci-boldiq-external-solver-runner
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: aks-aci-boldiq-external-solver-runner
    spec:
      terminationGracePeriodSeconds: 3600
      containers:
      - image: ...
        imagePullPolicy: Always
        name: boldiq-external-solver-runner
        resources:
          requests:
            memory: 8G
            cpu: 1
      imagePullSecrets:
        - name: docker-registry-secret-official
      nodeName: virtual-kubelet-aci-connector-windows-windows-westus

After deployment I ran the 'kubectl get pod aks-aci-boldiq-external-solver-runner-69bf9cd949-njzz2 -o yaml' command and verified that the setting below is present in the output:

  terminationGracePeriodSeconds: 3600

If I do 'kubectl delete pod', the container stays alive only for the default 30 seconds instead of the 1 hour that I want. Could the problem be in the VK, or could this behavior be caused by AKS?
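For reference, the grace period can also be forced per command (using the pod name from above); given the behavior so far I would expect the same 30-second result:

kubectl delete pod aks-aci-boldiq-external-solver-runner-69bf9cd949-njzz2 --grace-period=3600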

@macolso macolso added the kind/bug Something isn't working label Dec 19, 2019
Contributor

ibabou commented Dec 20, 2019

@eugen-nw, this is actually not supported at all today, on two levels:
1- The VK package itself, up to v1.1 (the version currently used by the Azure virtual kubelet), didn't honor that setting: it simply calls a pod delete on the provider and always sets 30 seconds. This was updated in v1.2, so we should be able to utilize it in the future. The other thing is that our provider doesn't send any updates about the pod after the delete call, so the pod actually gets deleted immediately even though K8s shows the 30 seconds. This latter point is going to be fixed shortly; I'm currently working on an update for that.
2- This is the main problem: ACI doesn't support a way to configure how termination should be handled, or what grace period to use if one is specified. The delete operation is synchronous too, so the resource is removed regardless of the actual pod cleanup that gets triggered on ACI's backend. We're aware of the limitations on ACI, but until those are supported, the fixes mentioned in (1) won't make a difference. @macolso the async deletion is coming with the new API, but I remember you/Deep mentioning termination handling. Can you please elaborate on whether it is planned for next semester?

@eugen-nw
Author

Thanks very much for having looked into this! When will this issue be fixed, please? Our major customer is not pleased that some of their long-running computations get killed midway through and need to be restarted on a different container.

@eugen-nw
Author

@ibabou your answer 2 above implies that even if we used Linux containers running on virtual-node-aci-linux, we'd run into the exact same problem. I assume that virtual-node-aci-linux is the equivalent Linux ACI connector. Are both of these statements correct?

Contributor

ibabou commented Jan 11, 2020

@eugen-nw if you mean the grace period and the wait on container termination on ACI's side, yeah, that's currently not supported for either Linux or Windows.

@ibabou ibabou added kind/enhancement New feature or request and removed kind/bug Something isn't working labels Jan 11, 2020
@eugen-nw
Author

Thanks very much, that's what I was asking about. That's very bad behavior on ACI's side. Do they plan to fix it?

Contributor

ibabou commented Jan 11, 2020

Our team owns both the ACI service and the AKS-VK integration, but I don't have an ETA for that feature. I'll let @dkkapur @macolso elaborate more.


dkkapur commented Jan 13, 2020

@eugen-nw indeed :( we're looking into fixing this in the coming months on ACI's side. Hope to have an update for you in terms of a concrete timeline shortly.

@eugen-nw
Author

@dkkapur: THANKS VERY MUCH for planning to address this problem soon! This is a major issue for our largest customer.

We scale our processing on demand, based on workload sent to the containers through a Service Bus Queue. There are two distinct types of processing: 1) under 2 minutes (the majority), and 2) over 40 minutes (occurs now and then). Whenever the AKS HPA scales down, it kills the containers that it spun up during scale-up. If one of the long processing operations happens to land on one of those scale-up containers, it gets aborted, and currently we have no way of avoiding that. We've designed the solution so that the processing restarts on another container, but our customer is definitely not happy that the 40-minute processing may occasionally run for much longer than that.
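On regular (non-virtual) node pools the standard Kubernetes mechanism for this would be a long terminationGracePeriodSeconds combined with a preStop hook that blocks until in-flight work drains - a minimal sketch (the drain script is hypothetical):

spec:
  terminationGracePeriodSeconds: 3600
  containers:
  - name: boldiq-external-solver-runner
    image: ...
    lifecycle:
      preStop:
        exec:
          # Hypothetical script shipped with the app; blocks until in-flight
          # work completes or the grace period expires.
          command: ["cmd", "/c", "drain-and-wait.cmd"]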

Contributor

macolso commented Jan 13, 2020

Ya - I've been working on enabling graceful termination / lifecycle hooks for ACI. If you want to talk more about your use case, I'd love to set up some time - shoot me an email: [email protected]

@AlexeyRaga

Bumping into the same issue with the autoscaler.


Four months have passed; are there any known workarounds, or an ETA for the fix?

@AlexeyRaga

@dkkapur @macolso @ibabou Sorry for bumping this again; it hurts us quite a lot here. Any news on this front?

Author

eugen-nw commented Jul 9, 2020

Probably customer focus is no longer trendy these days? I’ll check out the AWS offerings and will report back.

Contributor

macolso commented Jul 9, 2020

Hi @AlexeyRaga, unfortunately there's no concrete ETA we can share at this point. We're happy to hop on a call and talk a bit about the product roadmap though - email shared above ^^


asipras commented May 3, 2021

This is a big drawback: pods scheduled on a virtual node support neither Pod Lifecycle Hooks nor terminationGracePeriodSeconds. This functionality is needed to stop pods from being terminated during scale-in.

Is there any timeline for implementing this? @macolso


rustlingwind commented Jul 15, 2021

Does terminationGracePeriodSeconds work for AWS EKS pods on Fargate? Fargate nodes also look like a kind of virtual node.

@dkkapur dkkapur assigned macolso and unassigned dkkapur and ibabou Jul 16, 2021
@Andycharalambous

Any progress on this at all yet? It's been over two years since the last update.

@helayoty
Member

Hey @Andycharalambous, we will start working on it soon; no ETA yet.
