The best practice for Kubernetes memory limits is to set them equal to requests. This is the opposite of CPU limits, where the recommendation is to not use limits at all. Let me explain why with a pizza party analogy.
The danger of setting memory limits higher than requests
Imagine a pizza party where each guest requests 2 slices but is allowed to eat up to 4. You order pizza based on the requests — 2 slices per person. Everything is fine until a few hungry guests eat their full 4 slices.
Suddenly there's not enough pizza. A huge bouncer appears and shouts "OUT OF PIZZA KILLER (OOP KILL)!" — and throws a random guest out of the party. That guest didn't do anything wrong — they were within their limit — but someone had to go because total consumption exceeded total supply.
This is exactly what happens with Kubernetes OOM Kills. When the pods on a node together use more memory than the node physically has, the kernel's Out Of Memory Killer terminates a process to free memory, and the victim is often not the one causing the problem.
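To put numbers on the pizza party, here is a minimal sketch of the arithmetic. The `Pod` class, the pod names, and the 8192 MiB node size are made up for illustration, and the model is simplified: it ignores the memory a real node reserves for the system.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    name: str
    request_mib: int  # what the scheduler reserves for the pod
    limit_mib: int    # the most the pod is allowed to use
    usage_mib: int    # what the pod is actually using right now


def fits_by_requests(pods, node_capacity_mib):
    """The scheduler places pods based on requests, not limits or live usage."""
    return sum(p.request_mib for p in pods) <= node_capacity_mib


def node_overcommitted(pods, node_capacity_mib):
    """True when actual usage exceeds the memory the node physically has."""
    return sum(p.usage_mib for p in pods) > node_capacity_mib


NODE_MIB = 8192  # hypothetical node size

# Every guest "requests 2 slices but may eat up to 4".
pods = [
    Pod("a", request_mib=2048, limit_mib=4096, usage_mib=4096),
    Pod("b", request_mib=2048, limit_mib=4096, usage_mib=4096),
    Pod("c", request_mib=2048, limit_mib=4096, usage_mib=1024),
    Pod("d", request_mib=2048, limit_mib=4096, usage_mib=1024),
]

# Nobody broke the rules: every pod is within its own limit.
assert all(p.usage_mib <= p.limit_mib for p in pods)

print(fits_by_requests(pods, NODE_MIB))    # True: requests add up to 8192 MiB
print(node_overcommitted(pods, NODE_MIB))  # True: usage adds up to 10240 MiB
```

All four pods were scheduled legitimately and none exceeded its limit, yet the node is 2048 MiB short. Had each limit been 2048 MiB, total usage could never have passed 8192 MiB.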
Avoiding Kubernetes OOM Kills
The solution is simple: set memory limits equal to memory requests. Going back to our pizza analogy — if each guest can only eat exactly what they ordered, no one runs out of pizza.
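When limits equal requests, the limits on a node can never add up to more memory than the node has, because the scheduler only places pods whose requests fit. A pod that uses too much memory hits its own limit and only crashes itself. It doesn't cause cascading failures across other pods. Errors surface earlier and their impact is isolated.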
The difference between memory limits and CPU limits on Kubernetes
It's important to remember that memory is fundamentally different from CPU. Memory is a non-compressible resource. You can't just slow down memory access if you run out — you crash. CPU is compressible. If you run out of CPU, everything just runs slower (throttling).
This leads to two different best practices:
- For CPU: No limits (let pods burst and slow down if needed)
- For Memory: Limits equal to requests (guarantee isolation and prevent random OOM kills)
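Here is what those two rules look like together in a pod spec, as a rough sketch. The pod name, image, and sizes are placeholders, and `container_resources` is a helper invented for this example. The script prints a JSON manifest, which `kubectl apply -f` accepts just like YAML.

```python
import json


def container_resources(cpu_request: str, memory: str) -> dict:
    """Build a resources stanza: CPU request only, memory limit == request."""
    return {
        "requests": {"cpu": cpu_request, "memory": memory},
        "limits": {"memory": memory},  # deliberately no "cpu" entry
    }


pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "example-app"},  # placeholder name
    "spec": {
        "containers": [
            {
                "name": "app",
                "image": "registry.example.com/app:1.0",  # placeholder image
                "resources": container_resources("250m", "512Mi"),
            }
        ]
    },
}

print(json.dumps(pod, indent=2))
```

Building the stanza in one helper means the memory value is written once, so the limit and the request can't drift apart.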
Right-sizing your Kubernetes memory requests
Setting limits equal to requests only works if your requests are accurate. If you request too little, your pods will be OOM-killed again and again. If you request too much, you waste money.
To find the right values, use KRR (Kubernetes Resource Recommender). It analyzes your actual usage history from Prometheus and recommends how much memory each workload needs to run safely without wasting resources.
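To give a feel for what sizing from history means, here is a minimal sketch of one possible heuristic: take the peak observed usage and add a safety buffer. The function name, the 15% buffer, and the sample numbers are assumptions for illustration. This is not KRR's code, and KRR's actual strategy and defaults may differ.

```python
import math


def recommend_memory_mib(samples_mib, buffer_fraction=0.15):
    """Peak observed usage plus a safety buffer, rounded up to whole MiB.

    Illustrative heuristic only; the 15% default is an assumption.
    """
    if not samples_mib:
        raise ValueError("need at least one usage sample")
    return math.ceil(max(samples_mib) * (1 + buffer_fraction))


# Hypothetical memory usage samples for one pod, in MiB.
history = [310, 342, 298, 401, 377]

value = recommend_memory_mib(history)
print(f"memory request = memory limit = {value}Mi")
```

Sizing from the peak rather than the average matters for memory in particular: a pod that goes over its limit even briefly gets killed, so the request has to cover the worst case you have actually seen.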

Natan Yellin, CEO — Natan has been writing software for over 15 years. He regularly posts on LinkedIn.
