How Adidas Cut Kubernetes Costs by 50%

TechOps Examples

Hey — It's Govardhana MK 👋

Along with a use case deep dive, we identify the top news, tools, videos, and articles in the TechOps industry.

IN TODAY'S EDITION

🧠 Use Case

  • How Adidas Cut Kubernetes Costs by 50%

🚀 Top News

📽️ Videos

📚️ Resources

🛠️ TOOL OF THE DAY

prel - An application that temporarily assigns Google Cloud IAM roles and includes an approval process.

🧠 USE CASE

How Adidas Cut Kubernetes Costs by 50%

Kubernetes eating up to 50% of cloud costs is no longer an assumption. In fact, the latest CNCF Cloud Native FinOps Microsurvey captures this reality.

Ref: CNCF

More and more organizations are taking concrete actions to reduce infra costs, and Adidas is no different.

Before looking into how they cut Kubernetes costs, let’s understand the two key tools they adopted:

Karpenter - a Kubernetes autoscaler that dynamically provisions and scales resources, optimizing EC2 usage with spot instances to reduce costs.

  • Provisions compute based on real-time pod needs

  • Launches only necessary instance types, consolidates workloads

  • Removes underutilized nodes, swaps expensive instances

Ref: Karpenter Architecture

Kyverno - a Kubernetes policy engine that automates and enforces resource configurations for compliance across clusters.

  • Enforces policies by validating and mutating admission requests

  • Targets resources by type, name, labels

  • Provides Policy Reports for compliance insights

Ref: Kyverno architecture

Adidas First Approach: Getting cheaper EC2 instances

To reduce EC2 costs, they implemented Karpenter, which adjusted node counts based on application demand.

Karpenter performed consolidation by selecting the most suitable instance types and sizes to maximize node efficiency, removing underutilized nodes, and shifting workloads to smaller, cost-effective instances whenever possible.

Another key feature they leveraged was Karpenter’s use of spot instances - AWS’s lower-cost, unused compute capacity.

Karpenter identified spot instances with the lowest price and minimal interruption risk.
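A minimal sketch of what such a Karpenter setup can look like (the NodePool name, the instance categories, and the "default" EC2NodeClass are illustrative assumptions, not Adidas’ actual config):

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: spot-pool                      # illustrative name
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default                  # assumes an EC2NodeClass named "default"
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]             # prefer AWS spot capacity
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]      # let Karpenter pick suitable instance types
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized   # remove or replace underutilized nodes
    consolidateAfter: 1m

With a pool like this, Karpenter picks spot capacity and continuously consolidates workloads onto fewer, cheaper nodes.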

Adidas Second Approach: Creating VPAs automatically

They improved resource utilization by automating Vertical Pod Autoscalers (VPAs) for workloads in development and staging clusters.

This included optimizing container requests and limits, and adjusting replica counts when applications were idle.

Typically used for application security, Kyverno was already part of their setup.

Although using a security tool for VPA creation might seem unconventional, it proved highly effective in this context.

Kyverno automatically generated VPAs for each new Deployment, StatefulSet, or DaemonSet by checking:

  1. If the resource already had an HPA or VPA.

  2. If VPA creation was allowed for that resource and namespace via a specific label.

When both conditions were met, Kyverno created a VPA for the resource.
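A trimmed sketch of how such a Kyverno generate rule can look. The policy name and opt-in label are assumptions, and the lookup that skips resources which already have an HPA or VPA is omitted for brevity:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: generate-default-vpa           # illustrative name
spec:
  rules:
    - name: create-vpa
      match:
        any:
          - resources:
              kinds:
                - Deployment
                - StatefulSet
                - DaemonSet
              selector:
                matchLabels:
                  vpa-enabled: "true"  # assumed opt-in label; the real policy also checks no HPA/VPA exists
      generate:
        apiVersion: autoscaling.k8s.io/v1
        kind: VerticalPodAutoscaler
        name: "{{ request.object.metadata.name }}-vpa"
        namespace: "{{ request.object.metadata.namespace }}"
        data:
          spec:
            targetRef:
              apiVersion: apps/v1
              kind: "{{ request.object.kind }}"
              name: "{{ request.object.metadata.name }}"
            updatePolicy:
              updateMode: Auto         # default mode; Initial is gentler for disruption-sensitive apps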

Configuring VPAs without app-specific details required balancing cost savings with stability.

VPAs can adjust either requests only (for cost savings) or both requests and limits (for stability during spikes).

They set minAllowed low (10 millicores CPU, 32 MB memory) for scaling down, while maxAllowed had three options:

  • Set to original requests for lower cost but limited flexibility.

  • Set to original limits to allow full scaling but increase costs.

  • Set to high values to maximize resources but risk higher costs.

For multi-container apps, maxAllowed was left unspecified due to complexity.
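As a sketch, the bounds on a generated VPA can look like this for a single-container app. The maxAllowed values shown reflect the first option above and are assumed original request values; the other two options are noted in comments:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa                     # illustrative name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: Auto
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        minAllowed:
          cpu: 10m                     # the low floor from the article
          memory: 32Mi
        maxAllowed:                    # option 1: cap at original requests (cheapest, least flexible)
          cpu: 500m                    # assumed original request values
          memory: 512Mi
        # option 2: set maxAllowed to the original limits for full scaling headroom
        # option 3: set maxAllowed to high values, accepting potentially higher costs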

Result:

Default VPAs saved 30% in CPU and memory across development and staging, with some limitations:

  • VPAs don’t work with HPAs without custom metrics.

  • VPAs have issues with older Java apps due to heap size and memory reporting.

  • Disruption-sensitive apps may face restarts when a VPA applies new values; the Initial update mode can help, though the default is Auto.

Teams can label resources to opt out of VPA creation.

CPU and memory usage after VPA creation on a large staging cluster (ref: https://medium.com/adidoescode/reducing-cloud-costs-of-kubernetes-clusters-c8c1e3bdb669)


Adidas Third Approach: Scaling down during non-office hours

To reduce compute hours, costs, and CO₂ footprint, they decreased app replicas during non-office hours using kube-downscaler, which scales applications on a set schedule defined through annotations, like:

annotations:
  downscaler/downtime-replicas: "1"
  downscaler/uptime: Mon-Fri 08:00-19:00 Europe/Berlin

In this example, the app scales down to 1 replica during nights (7 pm to 8 am) and on weekends.

By default, apps scale to 1 replica, with options to set 0 replicas, adjust timing, or opt out. For apps with an HPA, scaling to 0 replicas requires annotation on the Deployment or StatefulSet instead of the HPA.

Adidas Fourth Approach: Scaling based on external metrics

Resource metrics may not fully capture app load, and HPAs can’t scale to 0 replicas, since an app with 0 pods generates no resource metrics to scale back up on.

To address this, they used KEDA (Kubernetes Event-driven Autoscaling), which scales apps using external metrics from sources like Prometheus and Kafka, enabling scaling to 0 replicas with independent metrics (e.g., Kafka consumer lag).

Custom metrics also allow HPA and VPA to work simultaneously, supporting vertical scaling on resource metrics and horizontal scaling on external metrics.
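A minimal KEDA ScaledObject sketch scaling on Kafka consumer lag (the app name, broker address, topic, and threshold are all assumptions):

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: orders-consumer                # illustrative name
spec:
  scaleTargetRef:
    name: orders-consumer              # the Deployment to scale
  minReplicaCount: 0                   # KEDA can scale all the way to zero
  maxReplicaCount: 10
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka.example.svc:9092   # assumed broker address
        consumerGroup: orders
        topic: orders
        lagThreshold: "50"             # scale out when lag per replica exceeds 50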

Adidas Final Challenge: Underutilized nodes due to restrictive PDBs

They found that many half-empty nodes remained because restrictive Pod Disruption Budgets (PDBs) prevented Karpenter from removing underutilized nodes.

To solve this, they created a Kyverno policy (sketched after this list) to ensure:

  • minAvailable is below 100%.

  • maxUnavailable is above 0%.

  • Apps have more than one replica.

  • HPAs have minReplicas above 1.
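A sketch of one such Kyverno validation rule, shown here for the maxUnavailable check; the policy name and message are illustrative, and the remaining checks follow the same pattern:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: validate-pdb                   # illustrative name
spec:
  validationFailureAction: Enforce
  rules:
    - name: pdb-maxunavailable-above-zero
      match:
        any:
          - resources:
              kinds:
                - PodDisruptionBudget
      validate:
        message: "PDB maxUnavailable must be greater than 0."
        pattern:
          spec:
            =(maxUnavailable): ">0"    # only checked when the field is set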

A cleanup policy also runs twice daily to remove undetected problematic PDBs, though it’s advised not to run this during cluster upgrades.
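The cleanup side can be expressed with a Kyverno cleanup policy along these lines (the schedule times and the blocking condition are assumptions):

apiVersion: kyverno.io/v2
kind: ClusterCleanupPolicy
metadata:
  name: cleanup-blocking-pdbs          # illustrative name
spec:
  schedule: "0 6,18 * * *"             # twice daily; actual times are an assumption
  match:
    any:
      - resources:
          kinds:
            - PodDisruptionBudget
  conditions:
    all:
      - key: "{{ target.spec.maxUnavailable || `1` }}"   # treat unset as non-blocking
        operator: Equals
        value: 0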

Altogether, these measures cut their Kubernetes costs by 50%.

I hope this was an insightful and inspiring use case that everyone can adapt.


Looking to promote your company, product, service, or event to 16,000+ TechOps Professionals? Let's work together.