Kubernetes Upgrades - How Not to Mess Up?
TechOps Examples
Hey — It's Govardhana MK 👋
Along with a use case deep dive, we identify the remote job opportunities, top news, tools, and articles in the TechOps industry.
👋 Before we begin... a big thank you to today's sponsor Notops
If your team finds Kubernetes and cloud-native technology overwhelming, Notops.io enables you to leverage its full potential without deep technical knowledge of AWS, Kubernetes, or complex cloud ecosystems.
Notops.io empowers teams to build scalable, secure Kubernetes platforms with industry-best practices and observability from the start—so you can focus on innovation, not infrastructure.
IN TODAY'S EDITION
🧠 Use Case
Kubernetes Upgrades - How Not to Mess Up?
🚀 Top News
Microsoft will soon let you clone your voice for Teams meetings
The feature allows real-time speech-to-speech translation in nine languages, simulating users' voices for more personal meetings. While promising for accessibility and engagement, it raises potential risks around misuse and deepfake-like scenarios.
👀 Remote Jobs
Nightwatch is hiring a DevOps Engineer
Remote Location: Worldwide
Stickermule is hiring a Site Reliability Engineer
Remote Location: Worldwide
📚️ Resources
AWS Reusable IaC AIOps Modules Collection
Reuse without affiliation for Machine Learning (ML), Foundation Models (FM), Large Language Models (LLM) and GenAI development and operations on AWS
Explaining Kubernetes To Uber Driver
A relatable explanation of K’UBER’netes for an Uber driver, in a fun, casual way that turns tech jargon into something even non-techies can nod along to, and maybe even chuckle at.
How to Find Resource Hogging Processes Using the Linux CLI
A practical guide to spotting and stopping resource hogging processes on Linux, with step-by-step commands and tools to keep your system running smoothly. Perfect for techies who love solving slowdowns.
👋 A big thank you to our sponsor Synthflow
Handle your phone calls 24/7 with AI
Deploy no-code, always-on, and human-like AI Phone calls
Book appointments, transfer calls, and extract valuable info.
Easily connects with your tech stack (native integrations with HubSpot and more)
🛠️ TOOL OF THE DAY
JSON CRACK - Transform your data into interactive graphs or trees as you type.
Supports JSON, YAML, CSV, XML, TOML.
Your data is never stored on the servers.
Everything happens on your device.
🧠 USE CASE
Kubernetes Upgrades - How Not to Mess Up?
You may have heard about the Reddit Kubernetes upgrade horror story: a 314-minute outage caused by a version upgrade from 1.23 to 1.24.
Whether you're running a startup's first cluster or managing production at scale, no one is immune to upgrade challenges.
Kubernetes releases move fast, and the N-2 support policy (only the three most recent minor versions receive patches) means staying on top of upgrades is critical.
Minor version timelines can quickly leave your cluster unsupported if upgrades are delayed.
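To make the N-2 window concrete, here is a minimal sketch that checks whether a cluster's minor version is still inside it. The function names and the example version numbers are my own illustrative assumptions, not part of any Kubernetes tooling.

```python
from typing import Tuple


def parse_minor(version: str) -> Tuple[int, int]:
    """Split 'major.minor[.patch]' (optional leading 'v') into (major, minor)."""
    parts = version.lstrip("v").split(".")
    return int(parts[0]), int(parts[1])


def is_supported(cluster: str, latest: str, window: int = 2) -> bool:
    """True if the cluster is at most `window` minor versions behind `latest`."""
    c_major, c_minor = parse_minor(cluster)
    l_major, l_minor = parse_minor(latest)
    if c_major != l_major:
        return False
    return 0 <= l_minor - c_minor <= window


# Illustrative versions only: with 1.31 as the newest release,
# 1.29 is still inside the N-2 window and 1.28 has fallen out of it.
print(is_supported("1.29", "1.31"))
print(is_supported("1.28", "1.31"))
```

A check like this could run in CI or a scheduled job, so falling out of support shows up as a failed check rather than a surprise.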
Kubernetes upgrade documentation provides details about the technical steps for upgrading.
Let’s not dive into that again here.
Instead, I’ve depicted the official upgrade process as a checklist in the illustration below for a quick reference.
How can we do it? Don’t worry, we’ll discuss it in detail right after this!
Phase 1: PLAN
DEV → Latest Kubernetes version to catch early issues and test breaking changes.
STAGING → DEV - 1 minor version to nail compatibility and smooth the path to production.
PROD → Close to STAGING for simplified workflows and reduced upgrade risks.
YAMLs → Per environment via Kustomize to handle configuration differences.
Testing Time → 2 weeks in staging, 1 month in dev for minor versions before production rollout.
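The version rules in this phase can be written down as a small sanity check. This is a sketch under my own assumptions: the `check_plan` helper is hypothetical, and I read "close to STAGING" as "at most one minor version behind staging", which you may want to tune.

```python
from typing import List


def minor(version: str) -> int:
    """Return the minor number from 'major.minor[.patch]' (optional leading 'v')."""
    return int(version.lstrip("v").split(".")[1])


def check_plan(dev: str, staging: str, prod: str, latest: str,
               max_prod_lag: int = 1) -> List[str]:
    """Return a list of rule violations; an empty list means the plan holds."""
    problems = []
    if minor(dev) != minor(latest):
        problems.append(f"dev ({dev}) should track the latest release ({latest})")
    if minor(dev) - minor(staging) != 1:
        problems.append(f"staging ({staging}) should be one minor behind dev ({dev})")
    lag = minor(staging) - minor(prod)
    # Assumption: "close to staging" means 0..max_prod_lag minors behind it.
    if not 0 <= lag <= max_prod_lag:
        problems.append(f"prod ({prod}) is not close enough to staging ({staging})")
    return problems


# Illustrative versions only.
for issue in check_plan(dev="1.31", staging="1.31", prod="1.27", latest="1.31"):
    print(issue)
```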
Phase 2: PREPARE
Add Kubernetes EOL and release dates to your calendar to stay on top of timelines.
Upgrade dev with a new cluster once the target version reaches patch .2. Keep the old dev cluster as a fallback while monitoring the new one for issues.
Upgrade staging to one minor version behind dev; a new cluster is optional.
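The two version rules in this phase are easy to encode as a gate. A minimal sketch follows; the helper names are hypothetical and the versions are only examples.

```python
def ready_for_dev(release: str, min_patch: int = 2) -> bool:
    """True once a release like '1.31.2' has reached the required patch level."""
    parts = release.lstrip("v").split(".")
    # Treat a missing patch component (e.g. '1.31') as patch 0.
    patch = int(parts[2]) if len(parts) > 2 else 0
    return patch >= min_patch


def staging_target(dev_version: str) -> str:
    """Return the minor version staging should run: one minor behind dev."""
    major, minor = dev_version.lstrip("v").split(".")[:2]
    return f"{major}.{int(minor) - 1}"


# Illustrative versions only.
print(ready_for_dev("1.31.0"))    # too early for the dev cluster
print(ready_for_dev("1.31.2"))    # patch .2 reached
print(staging_target("1.31.2"))   # minor version for staging
```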
Phase 3: ACT
Use Pluto to check for deprecated or removed API paths in configs and Helm charts.
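Check Helm releases with Nova for outdated charts to confirm CNIs, CoreDNS, and other dependencies are compatible.
Back up cluster resources and persistent volumes with Velero (it works through the Kubernetes API rather than snapshotting etcd directly) to safeguard critical data and enable disaster recovery.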
Monitor the upgrade progress and cluster health visually using Lens.
Follow Kubernetes upgrade documentation to upgrade the control plane and nodes sequentially.
Reboot nodes safely and automatically post-upgrade using Kured to apply OS updates.
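To show what the deprecated API check in the first step is looking for, here is a deliberately simplified sketch. It is not how Pluto works internally: the `REMOVED_IN` table is a tiny hand-picked subset of removals, and the regex-based parsing is an assumption that only handles plain top-level `apiVersion` and `kind` lines. Use Pluto itself for real checks.

```python
import re
from typing import List, Tuple

# (apiVersion, kind) -> minor version in which the API was removed.
# Small illustrative subset only; Pluto ships a far more complete list.
REMOVED_IN = {
    ("extensions/v1beta1", "Ingress"): "1.22",
    ("networking.k8s.io/v1beta1", "Ingress"): "1.22",
    ("batch/v1beta1", "CronJob"): "1.25",
    ("policy/v1beta1", "PodSecurityPolicy"): "1.25",
    ("policy/v1beta1", "PodDisruptionBudget"): "1.25",
    ("autoscaling/v2beta2", "HorizontalPodAutoscaler"): "1.26",
}


def _minor(version: str) -> int:
    return int(version.lstrip("v").split(".")[1])


def find_removed_apis(manifest: str, target: str) -> List[Tuple[str, str, str]]:
    """Return (apiVersion, kind, removed_in) for objects unusable on `target`."""
    findings = []
    # Split a multi-document YAML string on '---' separator lines.
    for doc in re.split(r"^---[ \t]*$", manifest, flags=re.MULTILINE):
        api = re.search(r"^apiVersion:\s*(\S+)", doc, flags=re.MULTILINE)
        kind = re.search(r"^kind:\s*(\S+)", doc, flags=re.MULTILINE)
        if not (api and kind):
            continue
        key = (api.group(1).strip("'\""), kind.group(1).strip("'\""))
        removed = REMOVED_IN.get(key)
        if removed and _minor(target) >= _minor(removed):
            findings.append((key[0], key[1], removed))
    return findings


SAMPLE = """\
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: nightly
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
"""

print(find_removed_apis(SAMPLE, target="1.25"))
```

The same manifest passes cleanly against 1.24 and fails against 1.25, which is exactly the kind of jump that bites during an upgrade.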
Worthy TLDR:
Can we skip minor versions and upgrade directly to the latest Kubernetes release?
No, upgrades must follow minor versions sequentially.
How often should we upgrade Kubernetes to stay secure and supported?
At least every 12–14 months, which is roughly how long a minor release receives patches, to stay within the N-2 support window.
Is it better to upgrade an existing cluster or create a new one and migrate workloads?
For clusters behind by more than two versions, starting fresh is often easier.
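The first and last answers can be sketched together: list every minor version you have to step through, then use the length of that path to decide whether a fresh cluster is worth considering. The helper names are hypothetical, and the threshold of two simply mirrors the rule of thumb above.

```python
from typing import List


def upgrade_path(current: str, target: str) -> List[str]:
    """List each minor version to step through, in order, from current to target."""
    cur_major, cur_minor = (int(p) for p in current.lstrip("v").split(".")[:2])
    tgt_major, tgt_minor = (int(p) for p in target.lstrip("v").split(".")[:2])
    if cur_major != tgt_major or tgt_minor < cur_minor:
        raise ValueError(f"cannot plan an upgrade from {current} to {target}")
    return [f"{cur_major}.{m}" for m in range(cur_minor + 1, tgt_minor + 1)]


def should_rebuild(current: str, target: str, threshold: int = 2) -> bool:
    """True when the cluster is more than `threshold` minors behind the target."""
    return len(upgrade_path(current, target)) > threshold


# Illustrative versions only.
print(upgrade_path("1.23", "1.27"))
print(should_rebuild("1.23", "1.27"))
print(should_rebuild("1.23", "1.25"))
```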
I keep hearing: 'Can WASM replace containers?'
For someone new to WASM, it’s a lightweight runtime for sandboxed code.
My take: It is great for edge cases and plugins, but containers still rule app deployments.
Thoughts?
— Govardhana Miriyala Kannaiah (@govardhana_mk)
2:19 PM • Nov 23, 2024