The Ops Community ⚙️

Daniele Polencic


Memory requests and limits in Kubernetes

In Kubernetes, what should I use as memory requests and limits?

And what happens when you don't set them?

Let's dive into it.

In Kubernetes, you have two ways to specify how much memory a pod can use:

  1. "Requests" are the amount of memory reserved for the container, usually set around its average consumption.
  2. "Limits" set the maximum amount of memory the container is allowed to use.

The Kubernetes scheduler uses requests to determine where the pod should be allocated in the cluster.

Since the scheduler doesn't know the consumption (the pod hasn't started yet), it needs a hint.

The Kubernetes scheduler works best with requests
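To make the "hint" idea concrete, here's a minimal sketch of the fit check the scheduler performs with requests. It's a simplification, not the scheduler's actual code: the function names, node names and numbers are made up for illustration, and only memory is considered.

```python
def fits_on_node(allocatable_mib, existing_requests_mib, new_request_mib):
    """Return True if the new pod's memory request fits in the unreserved memory."""
    # The check uses requests, not actual usage: the new pod hasn't started yet.
    reserved = sum(existing_requests_mib)
    return reserved + new_request_mib <= allocatable_mib


def pick_node(nodes, new_request_mib):
    """Return the name of the first node where the request fits, or None."""
    for name, (allocatable_mib, existing_requests_mib) in nodes.items():
        if fits_on_node(allocatable_mib, existing_requests_mib, new_request_mib):
            return name
    return None


# Hypothetical cluster: node name -> (allocatable MiB, requests of the pods already there)
nodes = {
    "node-a": (4096, [1024, 2048, 512]),  # 512 MiB unreserved
    "node-b": (4096, [1024, 1024]),       # 2048 MiB unreserved
}

print(pick_node(nodes, 1024))  # node-b
```

A pod without a request counts as zero in this sum, which is why the scheduler can't place it sensibly.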

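The kubelet uses limits to cap the container: it passes the limit to the container runtime, and the process is terminated (OOM-killed by the kernel) when it uses more memory than is allowed.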

It's worth noting that the process could spike in memory usage before it's terminated.

The kubelet terminates the container when it goes over the memory limit

The kubelet is also in charge of monitoring the total memory utilization of the node.

If memory is running low, the kubelet evicts low-priority pods.

But how does it decide what's low priority?

The kubelet evicts pods if the node is running low on resources

When Kubernetes creates a Pod, it assigns one of these QoS classes to the Pod:

  1. Guaranteed
  2. Burstable
  3. BestEffort

Pods that are "Guaranteed" have CPU and memory requests and limits on every container and are least likely to face eviction.

The values must also match: memory request = memory limit AND CPU request = CPU limit.

This class is best suited for stateful applications like databases.

Guaranteed Quality of Service for Pods

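A Pod is "Burstable" when at least one of its containers has a memory or CPU request or limit, but the Pod doesn't meet the criteria for "Guaranteed". The typical case is a Pod with memory and CPU requests but no limits.

This allows the Pods to flexibly increase their resources if available (but without limits, they could also use any amount of resources).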

Burstable Quality of Service for Pods

A Pod is "BestEffort" only if none of its containers has a memory or CPU limit or request.

Those Pods are the first to be evicted in the event of Node resource pressure.

BestEffort Quality of Service for Pods
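The three classes boil down to a few checks on each container's requests and limits. Here's a small sketch of that logic; `qos_class` is a hypothetical helper, not a Kubernetes API, and it compares quantities as plain strings (so "1" and "1000m" would wrongly count as different).

```python
def qos_class(containers):
    """Classify a pod from a list of container resource dicts.

    Each container looks like:
    {"requests": {"cpu": "500m", "memory": "256Mi"}, "limits": {...}}
    """
    any_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # When only a limit is set, Kubernetes uses it as the request too.
            request = requests.get(resource, limit)
            if request is not None or limit is not None:
                any_set = True
            # Guaranteed needs a limit on every resource, equal to the request.
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


same = {"cpu": "500m", "memory": "256Mi"}
print(qos_class([{"requests": same, "limits": same}]))                # Guaranteed
print(qos_class([{"requests": {"cpu": "250m", "memory": "128Mi"}}]))  # Burstable
print(qos_class([{}]))                                                # BestEffort
```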

Most of your pods are likely to be "Burstable" (i.e. requests, but often no limits), and a select few should be "Guaranteed".

Burstable pods are a good default because they reserve only what they request and borrow spare capacity when they need it, which makes them cheaper to run.

With Burstable pods you can dynamically allocate resources as the container needs them

With Guaranteed pods, you allocate all resources up to the limit upfront, which could result in more expensive (but safer) deployments.

With Guaranteed pods, resources are allocated upfront and can't be freed even if the process isn't using them

BestEffort pods are generally something you should avoid.

The Kubernetes scheduler doesn't know how much memory or CPU the process needs, so it could end up scheduling an impractical number of pods on the existing nodes.

You can fit as many BestEffort pods in a node as you wish

But if you stick only to Burstable pods, how does the kubelet know which pod to evict first?

Pods can have a PriorityClass that indicates the importance of a Pod relative to other Pods.

Pod PriorityClass

The scheduler also uses the Pod's PriorityClass to preempt (evict) lower-priority pods when the cluster is full.

For example, if you have low-priority batch jobs (e.g. reports), you could assign a low priority, and they will be evicted first.

Pods with lower priority are evicted to make space for higher priority pods
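So how does the kubelet pick a victim when the node runs low on memory? The Kubernetes documentation on node-pressure eviction describes the ranking as: pods using more than their request go first, then lower priority, then the largest usage above the request. Here's a simplified sketch of that ordering; `PodStats` and `eviction_order` are hypothetical names, and the numbers are invented.

```python
from dataclasses import dataclass


@dataclass
class PodStats:
    name: str
    priority: int     # value of the pod's PriorityClass
    request_mib: int  # memory request
    usage_mib: int    # current memory usage


def eviction_order(pods):
    """Sort pods from first-evicted to last-evicted (memory pressure only)."""
    def rank(pod):
        over_request = pod.usage_mib > pod.request_mib
        return (
            not over_request,                     # over their request: evicted first
            pod.priority,                         # then lower priority first
            -(pod.usage_mib - pod.request_mib),   # then largest usage above request
        )
    return sorted(pods, key=rank)


pods = [
    PodStats("api", priority=1000, request_mib=512, usage_mib=400),
    PodStats("report", priority=10, request_mib=256, usage_mib=600),
    PodStats("worker", priority=1000, request_mib=256, usage_mib=300),
    PodStats("cache", priority=10, request_mib=256, usage_mib=300),
]

print([p.name for p in eviction_order(pods)])
# ['report', 'cache', 'worker', 'api']
```

Notice that "api" is last even though it uses the most memory after "report": it stays within its request.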

How should you choose the memory request of a pod?

A simple way is to calculate the smallest memory unit as:

request = node memory / max number of Pods per node

For a 4GB node and a limit of 10 Pods, that's a 400MB request.

Assign the smallest unit or a multiple of it to your containers.

Assigning requests for your pods
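The arithmetic for the 4GB example is short enough to script. This is just a sketch: the function name and the "small/medium/large" tiers are made up, and 4GB is treated as 4000MB to match the round number above.

```python
def smallest_memory_unit_mb(node_memory_mb, max_pods_per_node):
    """Split the node's memory evenly across the maximum number of pods."""
    if max_pods_per_node <= 0:
        raise ValueError("max_pods_per_node must be positive")
    return node_memory_mb // max_pods_per_node


unit = smallest_memory_unit_mb(node_memory_mb=4000, max_pods_per_node=10)
print(unit)  # 400

# Hypothetical tiers: give each container the unit or a multiple of it.
tiers = {"small": unit, "medium": 2 * unit, "large": 4 * unit}
print(tiers)  # {'small': 400, 'medium': 800, 'large': 1600}
```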

A better approach is to monitor the app and derive the memory utilization.

You can do this with your existing monitoring infrastructure or use the Vertical Pod Autoscaler to monitor the app and recommend a request value.

Measuring memory consumption with the Vertical Pod Autoscaler
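If you already export memory metrics, a first estimate can come straight from the samples. The sketch below takes the average as a candidate request and reports the peak alongside it. It is not how the Vertical Pod Autoscaler computes its recommendation, and the sample values and function name are invented.

```python
from statistics import mean


def summarize_memory_mib(samples_mib):
    """Return the average (candidate request) and the peak of observed usage."""
    if not samples_mib:
        raise ValueError("need at least one sample")
    return {
        "average": round(mean(samples_mib)),
        "peak": max(samples_mib),
    }


# Hypothetical memory usage samples in MiB, e.g. one per minute.
samples = [210, 230, 250, 400, 220, 240]
print(summarize_memory_mib(samples))  # {'average': 258, 'peak': 400}
```

The gap between the average and the peak is a hint of how much the app bursts above its request.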

How should I set the limits?

Going over the limit gets the container terminated, so leave room for spikes, and definitely set a value lower than the memory available on the node.

Here's a handy calculator for that.

Also, if you want to dig deeper, here are a few relevant links:
