The Ops Community ⚙️

Daniele Polencic

How do you gracefully shut down Pods in Kubernetes?

When you run kubectl delete pod, the pod is marked for deletion, and the endpoint controller removes its IP address and port (endpoint) from the Services and etcd.

You can observe this with kubectl describe service.

Listing endpoints with kubectl describe

But that's not enough!

Several components sync a local list of endpoints:

  • kube-proxy keeps a local list of endpoints to write iptables rules.
  • CoreDNS uses the endpoints to update its DNS entries.

And the same is true for the Ingress controller, Istio, etc.

Endpoints propagation in Kubernetes

All those components will (eventually) remove the previous endpoint so that no traffic can ever reach it again.

At the same time, the kubelet is also notified of the change and deletes the pod.

What happens when the kubelet deletes the pod before the rest of the components?

Endpoints are not propagated and removed at the same time

Unfortunately, you will experience downtime because components such as kube-proxy, CoreDNS, the ingress controller, etc., still use that IP address to route traffic.

So what can you do?

Wait!

The kubelet will immediately delete the pod, even if the endpoint is not propagated

If you wait long enough before deleting the pod, in-flight requests can still complete, and new traffic is routed to other pods.

How are you supposed to wait?

The kubelet should wait for the endpoints to propagate before deleting the pod

When the kubelet deletes a pod, it goes through the following steps:

  • Triggers the preStop hook (if any).
  • Sends the SIGTERM signal to the container.
  • Sends the SIGKILL signal if the container is still running when the grace period expires (30 seconds by default).

The kubelet deleting the pod goes through 3 steps: preStop hook, SIGTERM and SIGKILL

You can use the preStop hook to insert an artificial delay.

You can use a preStop hook to delay deleting a pod
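As a rough illustration, a preStop delay could look like the Pod manifest below. The pod name, image, and both durations are placeholders, and the exec command assumes the container image ships a sleep binary.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app                        # hypothetical name
spec:
  # Must cover the preStop delay plus the time the app needs to shut down.
  terminationGracePeriodSeconds: 45   # illustrative value
  containers:
    - name: app
      image: example.com/my-app:1.0   # hypothetical image
      lifecycle:
        preStop:
          exec:
            # Assumes a `sleep` binary exists in the image.
            command: ["sleep", "15"]  # illustrative delay
```

With this setup the app keeps serving traffic for 15 seconds after the deletion request and only then receives SIGTERM.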

Alternatively, you can listen for the SIGTERM signal in your app and wait.

Once you are done waiting, you can gracefully stop the process and exit.

Kubernetes gives you 30 seconds to do so by default (configurable with terminationGracePeriodSeconds).

You can catch the SIGTERM signal in your app and wait

Should you wait 10, 20 or 30 seconds?

There's no single answer.

Although propagating endpoints may take only a few seconds, Kubernetes guarantees neither the timing nor that all components will finish at the same time.

Endpoint propagation timeline in Kubernetes


