-
Notifications
You must be signed in to change notification settings - Fork 477
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Functions being created w/ default TerminationGracePeriodSeconds #853
Comments
Please can you provide an actual example of commands we can run here? Let's say the Go sleep function from the docs for instance? Alex |
Can you demonstrate with logs and output of how your function gets scaled down and doesn't complete? I.e. with the above sleep function. |
/add label: support,question |
I'll see if I can put together a quick repro repo |
Also, while digging a little it looks like this was brought up previously. #637 |
@alexellis https://github.com/kevin-lindsay-1/openfaas-repro-gracePeriod Please let me know if you have any issues starting anything or reproducing. |
These steps are not showing an error or problem affecting you, they are just describing a configuration.
You also mention:
This is not what happens in OpenFaaS, we need to use the specified logic of the system i.e. scaling to fewer replicas.
We need to see a specific error i.e. console output, experienced as you would in production after running the scale down command. It would help if you provided: a sample function, a command for invoking it, a command to scale down to fewer replicas, and output showing that the function failed to complete. |
Ok, I have used said That said, I agree that I should have provided the exact command I used in order to delete the pod. However, a small facet where I disagree with that exact phrasing is the implication (which I am not putting words in your mouth for) that administrators need to use the commands that OF uses internally. I disagree with that specifically because OF uses K8S' API, which is designed to gracefully terminate in a standard way across pods. That is to say that graceful termination without overrides using k8s' API is just graceful termination, and that the onus is on OF to be predictable when used in k8s, and to be predictable to k8s administrators, rather than the other way around. For example, if an administrator or a node drain is stupid enough to forcefully Again, I should have provided the exact command I used, which probably would have prevented the request to use OF's internally-used k8s commands.
in shell A:
in shell B:
pod logs:
shell A:
pod is killed at based on this, I think there actually might be two issues here:
|
Having had a more detailed call with Kevin, we reproduced the problem using the tutorial from the OpenFaaS docs on extended timeouts. 2 min limit set on every core component Scaling the pod down caused the watchdog to stop accepting connections, and print a message that it was going to wait for 2 minutes. After 30 seconds, the watchdog was killed, short of its 2 minute grace period. We then set a termination grace period on the Deployment of the function, and the watchdog was then able to complete its work. There was a side effect, where the watchdog then remained idle for the remainder of the grace period. In another example, perhaps the grace period would be 1 hour, but the function completed within 1 minute, we'd then have a pod in a termination status for 59 minutes. The watchdog could be updated to track active invocations, and then be able to exit immediately after all work had been drained. https://github.com/openfaas/of-watchdog/blob/master/main.go#L110 To Kevin's point on adding new configurations for timeouts, we have plenty already. The watchdog uses the faas-netes/pkg/handlers/deploy.go Line 124 in fbf979f
Kevin mentioned the PR from the Istio team: istio/istio#35059 |
Prior to this change the watchdog would attempt to wait for the write_timeout value twice, even if there were no connections in flight. This adds a new metric for in-flight connections and removes the explicit wait. Testing with killall of-watchdog -s SIGTERM performs as expected, and clients that already connected complete their work before the termination. This change is a prerequisite for a change in faas-netes openfaas/faas-netes#853 Signed-off-by: Alex Ellis (OpenFaaS Ltd) <[email protected]>
Prior to this change the watchdog would attempt to wait for the write_timeout value twice, even if there were no connections in flight. This adds a new metric for in-flight connections and removes the explicit wait. Testing with killall of-watchdog -s SIGTERM performs as expected, and clients that already connected complete their work before the termination. This change is a prerequisite for a change in faas-netes openfaas/faas-netes#853 Signed-off-by: Alex Ellis (OpenFaaS Ltd) <[email protected]>
Prior to this change the watchdog would attempt to wait for the write_timeout value twice, even if there were no connections in flight. This adds a new metric for in-flight connections and removes the explicit wait. Testing with killall of-watchdog -s SIGTERM performs as expected, and clients that already connected complete their work before the termination. This change is a prerequisite for a change in faas-netes openfaas/faas-netes#853 Signed-off-by: Alex Ellis (OpenFaaS Ltd) <[email protected]>
Prior to this change the watchdog would attempt to wait for the write_timeout value twice, even if there were no connections in flight. This adds a new metric for in-flight connections and removes the explicit wait. Testing with killall of-watchdog -s SIGTERM performs as expected, and clients that already connected complete their work before the termination. This change is a prerequisite for a change in faas-netes openfaas/faas-netes#853 Signed-off-by: Alex Ellis (OpenFaaS Ltd) <[email protected]>
Prior to this change the watchdog would attempt to wait for the write_timeout value twice, even if there were no connections in flight. This adds a new metric for in-flight connections and removes the explicit wait. Testing with killall of-watchdog -s SIGTERM performs as expected, and clients that already connected complete their work before the termination. This change is a prerequisite for a change in faas-netes openfaas/faas-netes#853 Signed-off-by: Alex Ellis (OpenFaaS Ltd) <[email protected]>
I think that having the termination grace period match the write timeout makes a lot of sense, maybe with a small bump of 50 or 100ms so that there is no possibility of those competing with each other. But it seems reasonable that the longest a function could possibly do meaningful work is the write timeout. |
To clarify:
I've completed 2) which does not fix this issue, but makes it more tolerable if/when it's released. |
I have tested this behavior using the I do notice, however, that when not set, it uses the value of In my test, I tested timeout values of Maybe the healthcheck interval should be smaller, like In addition, it may be useful to perform one check outside of that tick interval, as soon as the shutdown process begins. This way, if you already have 0 active connections, the pod doesn't wait a tick, and instead immediately shuts down. |
Thanks for sharing your thoughts Kevin. Let me think about how this could be integrated into the implementation. Did you also see the work drain and complete as expected? |
I just tested that a pod being given 10 sleep requests all at the same time via a simple "batch" function, each with a duration of I invoked all of them in parallel with a I also tested that batching those invocations twice also worked (first batch, scale 0, scale 1, wait for pod to be ready, second batch), in that none of the executions for the second batch were sent to the pod that was shutting down. |
To follow up on my previous comment regarding the default healthcheck interval, I would argue that for most functions, In other words, the healthcheck probably uses |
When using the function CRD, pods that are created have the default
terminationGracePeriodSeconds
of30
. This means that if I have a function that needs to run for longer than 30 seconds, the pod may be forcefully killed before being allowed to complete its work.I do not know if this behavior is exclusive to functions created with the
Function
CRD.Expected Behaviour
When a function pod is terminating it should be gracefully shut down and exit after the
exec_timeout
.Current Behaviour
Deployments generated by the
Function
CRD are created with the defaultTerminationGracePeriodSeconds
(30).This causes pods with timeouts >
30s
to beSIGKILL
ed while they're still trying to gracefully exit.Are you a GitHub Sponsor (Yes/No?)
Check at: https://github.com/sponsors/openfaas
List All Possible Solutions and Workarounds
Overtly making this configurable, or computed via one of the environment variables inside the yaml, such as
exec_timeout
.Then actually just use the config, because I don't see any reference to
TerminationGracePeriodSeconds
when doing a search all for this repo.Which Solution Do You Recommend?
Currently there's already a number of timeout config variables to keep track of. This might as well just be another one.
If there's a great way to combine all the existing timeouts, that'd be great, but otherwise just being able to manually specify the grace period termination seconds would be fine.
Steps to Reproduce (for bugs)
Function
CRDterminationGracePeriodSeconds
Alternatively:
sleep
function with a duration of2m
. Invoke it once.SIGKILL
from kubernetes30s
after entering theterminating
state.Context
Pods are being forcefully killed before finishing their work, which effectively prevents autoscaling from being able to be enabled, because there is no guarantee that a pod that is created will have time to do its job.
Your Environment
FaaS-CLI version ( Full output from:
faas-cli version
):0.13.13
Docker version
docker version
(e.g. Docker 17.0.05 ):20.10.8
What version and distriubtion of Kubernetes are you using?
kubectl version
server v1.21.3
client v1.22.2
Operating System and version (e.g. Linux, Windows, MacOS):
MacOS
Link to your project or a code example to reproduce issue:
What network driver are you using and what CIDR? i.e. Weave net / Flannel
The text was updated successfully, but these errors were encountered: