Ungraceful pod termination #1987
Comments
What terminationGracePeriodSeconds value do you see assigned to the deployment pod? If that value is too low, Karpenter may not have the time it needs to handle the SIGTERM before it gets a SIGKILL.
Can you also provide the command that you used to force this behavior? Just trying this out, I don't observe Karpenter going into an error state or hanging its process when it gets the SIGTERM signal.
/triage needs-information
Thanks Jonathan for looking into this. The
As for the command, I use
It's hard to observe because the pod goes away immediately; if there were a way to shell in, we might be able to tell what's going on at the container level. To track this, we have a tool that listens to Kubernetes pod events, and in the case of Karpenter it receives several pod termination errors. The tool then creates a K8s Event to report that, which is how we see the problem.
Here is a screenshot of our tool (private repo) that explains what it does:
The source code that tracks the error looks like this, which explains how it picks that up:
In the case of Karpenter,
Description
Observed Behavior:
I can observe that the Karpenter pod exits ungracefully when it is terminated.
The kubelet tells Karpenter to shut down, and the pod is killed immediately. The Go code does not seem to handle the SIGTERM signal and exit gracefully.
Expected Behavior:
The Karpenter pod should handle the SIGTERM signal and exit with 0 as the container exit code.
Also, if you check the pod's status for containerStatuses, you will see that the terminated.reason is not "Completed" but "Error", which indicates a non-graceful termination.
Reproduction Steps (Please include YAML):
Restarting the Karpenter deployment exhibits this behaviour. Tailing the pod's logs shows an immediate termination.
Also, if you check the pod's status for containerStatuses, the terminated.reason should say "Completed" or the container exit code should be 0.
Versions:
(kubectl version):