Kubernetes OOMKilled and ImagePullBackOff Simplified

TechOps Examples

Hey, it's Govardhana MK 👋

Along with a use case deep dive, we identify the remote job opportunities, top news, tools, and articles in the TechOps industry.

👋 Before we begin... a big thank you to today's sponsor PERFECTSCALE

[Automated K8s Optimization Platform] 🔥 50% Kubernetes cost reduction - without risk or manual effort

PerfectScale by DoiT autonomously optimizes Kubernetes in real time, ensuring you only pay for what you actually need - while improving performance and stability. No manual effort, no risky guesswork, no code changes.

  • Up to 50% cost savings

  • 92% fewer resiliency issues

  • 43% of DevOps time regained

Start optimizing Kubernetes today - without the headaches. Try now!

IN TODAY'S EDITION

🧠 Use Case
  • Kubernetes OOMKilled and ImagePullBackOff Simplified

🚀 Top News

👀 Remote Jobs
  • Chess.com is hiring a Senior SRE

    Remote Location: Worldwide

📚 Resources

📢 Reddit Threads

🛠️ TOOL OF THE DAY

Docmost - An open source collaborative wiki and documentation software.

An open-source alternative to Confluence and Notion.

🧠 USE CASE

Kubernetes OOMKilled and ImagePullBackOff Simplified

Today's use case is about OOMKilled and ImagePullBackOff.

Let's take a quick look at these two errors, as they are among the most common issues in Kubernetes.

We'll break down why they happen and how to troubleshoot them effectively.

OOMKilled

OOMKilled occurs when a container exceeds its memory limit, or when the node itself runs out of memory and the Linux kernel's OOM killer terminates the container's process. The container then exits with code 137.
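The exit code carries a small clue of its own. By convention, an exit code above 128 means the process was ended by a signal, and 137 is 128 + 9, where 9 is SIGKILL, the signal used by the kernel's OOM killer. Here is a minimal Python sketch of that arithmetic; the function name is made up for illustration, and the signal numbers assume Linux.

```python
import signal

def signal_from_exit_code(exit_code: int):
    """Return the name of the signal encoded in an exit code, or None."""
    if exit_code <= 128:
        return None  # the process exited on its own, not because of a signal
    try:
        return signal.Signals(exit_code - 128).name
    except ValueError:
        return None  # no known signal with that number on this platform

print(signal_from_exit_code(137))  # SIGKILL
```

Exit code 137 on its own only says that SIGKILL was delivered; the OOMKilled reason in the pod status is what confirms memory as the cause.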

Together, the pods on a node must use less memory than the node has available; if they exceed it, some containers are killed or pods evicted to bring usage back down.
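To make the node-level picture concrete, here is a rough Python sketch that adds up the memory limits of the pods on a node and compares the total with the node's allocatable memory. The function names and sample values are hypothetical, and the parser only handles the most common Kubernetes quantity suffixes.

```python
UNITS = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,  # binary suffixes
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,     # decimal suffixes
}

def parse_memory(quantity: str) -> int:
    """Convert a quantity such as '512Mi' or '2G' to bytes."""
    # Try two-letter suffixes first so 'Mi' is not mistaken for something else.
    for suffix in sorted(UNITS, key=len, reverse=True):
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * UNITS[suffix])
    return int(quantity)  # no suffix: plain bytes

def memory_headroom(node_allocatable: str, pod_limits: list[str]) -> int:
    """Bytes left if every pod reached its limit; negative means overcommitted."""
    return parse_memory(node_allocatable) - sum(parse_memory(q) for q in pod_limits)

headroom = memory_headroom("8Gi", ["4Gi", "3Gi", "2Gi"])
print(headroom < 0)  # True: the limits add up to 9Gi on an 8Gi node
```

A negative result does not mean pods will be killed right away, only that they could be if all of them approached their limits at the same time.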

How to Fix OOMKilled Kubernetes Error (Exit Code 137)

1. Identify OOMKilled Event: Run kubectl get pods and check if the pod status shows OOMKilled.

2. Gather Pod Details: Run kubectl describe pod [pod-name] and review the container's state in the output.

Look for a Last State of Terminated with the reason OOMKilled and exit code 137:


State:          Running
  Started:      Tue, 25 Feb 2025 19:15:00 +0200
Last State:     Terminated
  Reason:       OOMKilled
  Exit Code:    137
...

3. Analyze Memory Usage: Check memory usage patterns to identify if the limit was exceeded due to a spike or consistent high usage.

4. Adjust Memory Settings: Increase memory limits in pod specs if necessary, or debug and fix any memory leaks in the application.

5. Prevent Overcommitment: Set memory requests that reflect real usage and keep limits close to them, so the combined limits of the pods on a node do not far exceed the memory the node can actually provide.
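The first two steps can also be scripted when there are many pods to check. The Python sketch below assumes the pod has been exported as JSON (for example with kubectl get pod [pod-name] -o json) and looks at each container's current and last state. The function name and the sample data are made up for illustration.

```python
import json

def oom_killed_containers(pod: dict) -> list[str]:
    """Names of containers whose current or last state is terminated with OOMKilled."""
    names = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        for key in ("state", "lastState"):
            terminated = status.get(key, {}).get("terminated") or {}
            if terminated.get("reason") == "OOMKilled":
                names.append(status["name"])
                break  # count each container once
    return names

# Trimmed sample shaped like the pod JSON returned by kubectl (values are invented).
sample = json.loads("""
{"status": {"containerStatuses": [
  {"name": "api", "state": {"running": {}},
   "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}}},
  {"name": "sidecar", "state": {"running": {}}, "lastState": {}}
]}}
""")
print(oom_killed_containers(sample))  # ['api']
```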

Point worth noting:

If a pod is terminated due to a memory issue, it doesn't necessarily mean it will be removed from the node.

If the pod's restart policy is set to 'Always', the kubelet will try to restart the container.

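A pod's QoS class affects how likely its containers are to be killed when the node runs short of memory. To check the QoS class of a pod, run this command:

kubectl get pod [pod-name] -o jsonpath='{.status.qosClass}'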
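For intuition about where that value comes from, here is a simplified Python sketch of how the three QoS classes follow from the resource settings of a pod's containers. It is an approximation rather than the kubelet's implementation: the function name is hypothetical, init containers are ignored, and quantities are compared as plain strings (so "1" and "1000m" would be treated as different).

```python
def qos_class(containers: list[dict]) -> str:
    """Approximate the QoS class Kubernetes would assign to a pod."""
    any_set = False
    guaranteed = True
    for container in containers:
        resources = container.get("resources", {})
        limits = resources.get("limits", {})
        # When a request is omitted but a limit is set, the request defaults to the limit.
        requests = {**limits, **resources.get("requests", {})}
        if limits or requests:
            any_set = True
        for name in ("cpu", "memory"):
            # Guaranteed needs a limit for both resources, with request equal to limit.
            if name not in limits or requests.get(name) != limits[name]:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"

print(qos_class([{"resources": {"limits": {"cpu": "500m", "memory": "256Mi"}}}]))  # Guaranteed
```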

To inspect the oom_score of a process running in a pod:

1. Run kubectl exec -it [pod-name] -- /bin/bash

2. To see the oom_score, run cat /proc/[pid]/oom_score

3. To see the oom_score_adj, run cat /proc/[pid]/oom_score_adj

The process with the highest oom_score is the first to be terminated when the node experiences memory exhaustion.
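The kubelet sets oom_score_adj for each container based on the pod's QoS class, and a higher value makes a process a more likely target for the OOM killer. The Python sketch below follows the values and formula given in the Kubernetes documentation on node out-of-memory behavior; the function name is hypothetical, and the kubelet's actual code may differ in edge cases.

```python
def oom_score_adj(qos: str, memory_request_bytes: int = 0, node_capacity_bytes: int = 1) -> int:
    """oom_score_adj a container would get, per the documented rules."""
    if qos == "Guaranteed":
        return -997  # very unlikely to be chosen
    if qos == "BestEffort":
        return 1000  # first in line to be killed
    # Burstable: the larger the memory request relative to the node, the lower the score.
    adj = 1000 - (1000 * memory_request_bytes) // node_capacity_bytes
    return min(max(2, adj), 999)

GiB = 2**30
print(oom_score_adj("Burstable", 4 * GiB, 16 * GiB))  # 750
```

This is why a Burstable pod with a small memory request is killed before one that requested a large share of the node.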

ImagePullBackOff

An ImagePullBackOff error happens when the kubelet fails to pull the required container image. While this lasts, the container stays in a Waiting state and the pod cannot start.

What exactly happens during an ImagePullBackOff?
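In short, the first failed pull is reported as ErrImagePull, and the kubelet then keeps retrying with a growing delay between attempts, which is the BackOff part of the name. The delay is capped at 300 seconds (5 minutes). The Python sketch below illustrates such a schedule; the 10-second starting delay and the doubling factor are assumptions made for illustration, and the function name is hypothetical.

```python
def pull_retry_delays(attempts: int, initial: int = 10, cap: int = 300) -> list[int]:
    """Seconds waited before each retry: doubles every time until it reaches the cap."""
    delays = []
    delay = initial
    for _ in range(attempts):
        delays.append(min(delay, cap))
        delay *= 2
    return delays

print(pull_retry_delays(7))  # [10, 20, 40, 80, 160, 300, 300]
```

Because of this delay, a pod can take several minutes to recover after the underlying problem has been fixed.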

Fixing ImagePullBackOff

1. Fix Manifest or Image Name

Correct typos or incorrect image names in the pod manifest.

Apply the corrected manifest to update the pod.

$ kubectl apply -f pod.yaml
pod/techops-pod configured
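One way to spot a typo before applying the manifest is to split the image reference into its parts and check each one against the registry. The Python sketch below is a simplified take on how image references are usually interpreted; the function name is hypothetical, digests (@sha256:...) are not handled, and docker.io as the default registry is an assumption based on the Docker convention.

```python
def split_image_reference(image: str) -> dict:
    """Split an image reference into registry, repository and tag."""
    first, _, rest = image.partition("/")
    # The first path component is a registry only if it looks like a host name.
    if rest and ("." in first or ":" in first or first == "localhost"):
        registry, remainder = first, rest
    else:
        registry, remainder = "docker.io", image
    repository, sep, tag = remainder.rpartition(":")
    if not sep:
        # No tag given: the runtime falls back to "latest".
        repository, tag = remainder, "latest"
    return {"registry": registry, "repository": repository, "tag": tag}

print(split_image_reference("techopsexamples.com/api-service:v2.5"))
# {'registry': 'techopsexamples.com', 'repository': 'api-service', 'tag': 'v2.5'}
```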

2. Verify the Image Exists

Ensure the image and tag exist in the registry before the pod tries to pull them. If the image is missing, push it:

$ docker push techopsexamples.com/api-service:v2.5

3. Image Registry Authorization

Create a secret with credentials to access the private registry.

$ kubectl create secret docker-registry reg-secret \
    --docker-server=private-repo.com \
    --docker-username=myuser \
    --docker-password=mysecurepassword \
    --docker-email=[email protected]
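For reference, a docker-registry secret is of type kubernetes.io/dockerconfigjson and stores a single JSON document under the .dockerconfigjson key. The Python sketch below builds a document of that general shape, which can help when comparing against what is actually stored in the cluster. The function name is hypothetical, the email address is a placeholder, and the exact set of fields may vary between kubectl versions.

```python
import base64
import json

def docker_config_json(server: str, username: str, password: str, email: str) -> str:
    """Build a Docker config JSON document for one registry."""
    # The "auth" field is the base64 encoding of "username:password".
    auth = base64.b64encode(f"{username}:{password}".encode()).decode()
    config = {"auths": {server: {
        "username": username,
        "password": password,
        "email": email,
        "auth": auth,
    }}}
    return json.dumps(config)

# user@example.com is a placeholder, not a value from the original command.
payload = docker_config_json("private-repo.com", "myuser", "mysecurepassword", "user@example.com")
print(sorted(json.loads(payload)["auths"]["private-repo.com"]))
# ['auth', 'email', 'password', 'username']
```

Note that base64 is an encoding, not encryption, so anyone who can read the secret can recover the password.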

Link the secret to the pod's manifest to allow access.

apiVersion: v1
kind: Pod
metadata:
  name: private-api-pod
  labels:
    app: private-api
spec:
  containers:
  - name: private-api-container
    image: private-repo.com/internal-app:v3.0
  imagePullSecrets:
  - name: reg-secret

Next time you encounter an OOMKilled or ImagePullBackOff error, you'll know exactly how to diagnose and fix it efficiently.

I run a DevOps and Cloud consulting agency and have helped 17+ businesses, including Stanford, Hearst Corporation, CloudTruth, and more.

When your business needs my services, book a free 1:1 business consultation.

Looking to promote your company, product, service, or event to 40,000+ Cloud Native Professionals? Let's work together.