How To Fix OOMKilled
Good day. It's Monday, Aug. 12, and in this issue, we're covering:
How To Fix OOMKilled
HashiCorp 2024 State of Cloud Strategy Report
Google Cloud launches BigQuery continuous queries
Adidas Reduced Kubernetes Cluster Costs by 50%
Bash Script to Install Multiple Helm Charts in One Go
Terraform Ephemeral Values Explained
You share. We listen. As always, send us feedback at [email protected]
Use Case
How To Fix OOMKilled
OOMKilled occurs in Kubernetes when a container exceeds its memory limit, or when the node itself runs out of memory and the Linux OOM killer terminates one of its processes. The container is killed with SIGKILL and reports exit code 137.
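Exit code 137 follows the common convention of reporting 128 plus the signal number, and SIGKILL is signal 9. A minimal sketch of decoding such a code; the function name explain_exit_code is hypothetical, and the signal name lookup assumes a Linux or other POSIX host:

```python
import signal


def explain_exit_code(code: int) -> str:
    """Describe an exit code using the 128 + signal-number convention."""
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform.
            name = f"signal {signum}"
        return f"terminated by {name} (128 + {signum})"
    return f"exited on its own with status {code}"


print(explain_exit_code(137))
```

A code of 137 therefore only says the process received SIGKILL; the OOMKilled reason in the pod status is what ties it to memory.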
A typical OOMKilled pod looks like this:
NAME            READY   STATUS      RESTARTS   AGE
web-app-pod-1   0/1     OOMKilled   0          4m7s
"Pods must use less memory than the total available on the node; if they exceed this, Kubernetes will kill some pods to restore balance."
[Figure: visual explanation of OOMKilled. Credit: PerfectScale]
How to Fix OOMKilled Kubernetes Error (Exit Code 137)
1. Identify OOMKilled Event: Run kubectl get pods and check if the pod status shows OOMKilled.
2. Gather Pod Details: Run kubectl describe pod [pod-name] and check the container's State and Last State fields for the OOMKilled reason.
Look for output like the following:
    State:          Running
      Started:      Sun, 11 Aug 2024 19:15:00 +0200
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
    ...
3. Analyze Memory Usage: Check memory usage patterns (for example with kubectl top pod [pod-name], which requires metrics-server) to determine whether the limit was exceeded by a sudden spike or by consistently high usage.
4. Adjust Memory Settings: Increase memory limits in pod specs if necessary, or debug and fix any memory leaks in the application.
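5. Prevent Overcommitment: The scheduler places pods based on memory requests, not limits, so limits set far above requests can overcommit a node. Keep requests close to real usage and make sure the combined limits of the pods on a node stay within what the node can actually provide.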
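Steps 1 and 2 can also be checked in bulk by reading the JSON that kubectl get pods -o json returns. The sketch below is one possible way to do it, not a definitive tool: the function name find_oomkilled, the container name "web", and the sample data are hypothetical, while the field names (containerStatuses, state, lastState, terminated, reason, restartCount) are those of the Kubernetes Pod status.

```python
import json


def find_oomkilled(pod_list: dict) -> list:
    """Return (pod, container, restart_count) for containers whose
    current or last state was terminated with reason OOMKilled."""
    hits = []
    for pod in pod_list.get("items", []):
        pod_name = pod["metadata"]["name"]
        statuses = pod.get("status", {}).get("containerStatuses", [])
        for cs in statuses:
            for key in ("state", "lastState"):
                terminated = cs.get(key, {}).get("terminated")
                if terminated and terminated.get("reason") == "OOMKilled":
                    hits.append((pod_name, cs["name"], cs.get("restartCount", 0)))
                    break  # report each container once
    return hits


# Hypothetical sample shaped like `kubectl get pods -o json` output.
sample = json.loads("""
{"items": [{
  "metadata": {"name": "web-app-pod-1"},
  "status": {"containerStatuses": [{
    "name": "web",
    "restartCount": 1,
    "state": {"running": {}},
    "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}}
  }]}
}]}
""")

print(find_oomkilled(sample))
```

In practice the input could come from piping kubectl get pods -o json into the script and calling json.load on standard input.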
Point worth noting:
"If a pod is terminated due to a memory issue, it doesn’t necessarily mean it will be removed from the node. If the pod’s restart policy is set to ‘Always’, the kubelet will try to restart the container."
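The QoS class of a pod (Guaranteed, Burstable, or BestEffort) influences which pods are killed first when a node runs low on memory. To check it, run this command:
kubectl get pod [pod-name] -o jsonpath='{.status.qosClass}'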
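For intuition about where the QoS class comes from, the sketch below approximates the rules from a pod's container resource settings. It is a simplification under stated assumptions: the function name qos_class is hypothetical, quantities are compared as plain strings (so "1Gi" and "1024Mi" would wrongly count as different), and init containers are ignored. The value reported by the cluster remains authoritative.

```python
def qos_class(containers: list) -> str:
    """Approximate a pod's QoS class from its containers' resources.

    Guaranteed: every container has cpu and memory limits, and its
                requests equal those limits.
    BestEffort: no container sets any cpu or memory request or limit.
    Burstable:  everything in between.
    """
    any_set = False
    guaranteed = True
    for container in containers:
        resources = container.get("resources", {})
        requests = resources.get("requests", {})
        limits = resources.get("limits", {})
        for resource in ("cpu", "memory"):
            if resource in requests or resource in limits:
                any_set = True
            limit = limits.get(resource)
            # An omitted request defaults to the limit.
            request = requests.get(resource, limit)
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


# Hypothetical container specs, shaped like spec.containers entries.
print(qos_class([{"resources": {"limits": {"cpu": "500m", "memory": "256Mi"}}}]))
print(qos_class([{"resources": {"requests": {"memory": "128Mi"},
                                "limits": {"memory": "256Mi"}}}]))
print(qos_class([{}]))
```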
To inspect the oom_score of a process running in a pod:
1. Run kubectl exec -it [pod-name] -- /bin/bash
2. To see the oom_score, run cat /proc/[pid]/oom_score
3. To see the oom_score_adj, run cat /proc/[pid]/oom_score_adj
The process with the highest oom_score is the first to be terminated when the node experiences memory exhaustion.
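To compare scores across processes rather than reading them one at a time, the files under /proc can be read in a loop. This is a minimal sketch assuming a Linux host or container with Python available; the function name rank_by_oom_score and the proc_root parameter are hypothetical. It sorts the highest oom_score first, since that is the process the kernel's OOM killer prefers to terminate.

```python
from pathlib import Path


def rank_by_oom_score(proc_root: str = "/proc") -> list:
    """Return (pid, oom_score, oom_score_adj) tuples, highest oom_score first."""
    root = Path(proc_root)
    if not root.is_dir():
        return []  # no procfs here (for example, not a Linux system)
    ranked = []
    for entry in root.iterdir():
        if not entry.name.isdigit():
            continue  # only numeric directories are processes
        try:
            score = int((entry / "oom_score").read_text())
            adj = int((entry / "oom_score_adj").read_text())
        except (OSError, ValueError):
            continue  # process exited or the file is unreadable
        ranked.append((int(entry.name), score, adj))
    ranked.sort(key=lambda row: row[1], reverse=True)
    return ranked


if __name__ == "__main__":
    # Show the five most likely OOM-kill candidates visible from here.
    for pid, score, adj in rank_by_oom_score()[:5]:
        print(pid, score, adj)
```

Run inside a container, this only sees the processes in that container's PID namespace; run on the node, it sees every process.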
Tool Of The Day
An open-source antivirus engine for detecting trojans, viruses, malware & other malicious threats.
Trends & Updates
The report reveals only 8% of organizations are highly cloud-mature, gaining stronger security and faster development. Key findings include 91% reporting cloud waste, 64% facing a skills shortage, and 79% using or planning multi-cloud deployments.
This new feature allows instant data processing, integrates with Google’s AI tools for real-time machine learning, and simplifies event-driven architectures, all within BigQuery.
Resources & Tutorials
Adidas cut its Kubernetes cluster costs by 50% by automating resource scaling with Karpenter, enhancing efficiency with Kyverno-driven VPAs, and strategically downsizing during non-peak hours, all while tackling challenges like node underutilization and balancing application performance.
This article introduces "helmister," a lightweight bash script that automates the installation and uninstallation of multiple Helm charts, ideal for secure, air-gapped environments where simplicity and accessibility are key.
Explore how ephemeral values in Terraform can minimize secret sprawl and enhance security by limiting the persistence of sensitive data, offering a fresh perspective on managing secrets and other transient resources effectively.
"The only limit to our realization of tomorrow is our doubts of today."
- Franklin D. Roosevelt