Daily DevOps Tips
Why Companies Are Leaving Kubernetes
👋 Hi! I’m Bibin Wilson. In each edition, I share practical tips, guides, and the latest trends in DevOps and MLOps to make your day-to-day DevOps tasks more efficient. If someone forwarded this email to you, you can subscribe here to never miss out!
Note: You can read the web version here
When it comes to running applications today, Kubernetes is becoming the gold standard.
However, in recent times, several companies have chosen to move away from Kubernetes due to its complexity and the challenges it presents.
In today's edition, I want to share three such stories and the lessons behind them. Learning from others' experiences is a great way to avoid costly mistakes in infrastructure management, and it can help you make better design choices when working with Kubernetes.
1. Gitpod Story
Gitpod is a cloud-based development environment that provides pre-configured, ready-to-code workspaces for developers. It allows users to spin up a development environment in seconds.
Gitpod started with Kubernetes as the backbone for their cloud-based development environments.
Like many others, they believed Kubernetes’ scalability, automation, and orchestration would be perfect for handling thousands of development environments daily.
But as they grew, they ran into unexpected challenges. They struggled with the unique needs of development environments, which are highly stateful and interactive.
Here’s where things started to break:
CPU Bursts – Developers need instant processing power, but Kubernetes scheduling wasn’t fast enough, causing frustrating delays.
Memory Management – RAM is expensive, so they tried overcommitting it (Kubernetes allows this through requests and limits). But without proper swap space, Kubernetes would simply kill processes when memory ran out via the OOM (Out of Memory) Killer.
Storage Performance – Fast SSDs improved performance, but they were tied to specific nodes. Persistent storage (PVCs) should have helped, but it was slow and unreliable.
Security & Isolation – Developers needed root-like access to install packages and configure their environments, but that clashed with Kubernetes' strict security model.
Networking Complexity – Each environment had to be isolated for security, but also needed flexible access for developers.
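The memory-management failure mode above can be made concrete with a pod spec. This is an illustrative sketch, not Gitpod's actual configuration: setting the memory request well below the limit lets the scheduler pack more workspaces onto a node than its RAM can serve at once, and without swap any container that bursts past available memory gets its processes OOM-killed.

```yaml
# Hypothetical workspace pod illustrating memory overcommit
# (names and values are illustrative, not from the article).
apiVersion: v1
kind: Pod
metadata:
  name: dev-workspace
spec:
  containers:
    - name: ide
      image: ubuntu:22.04      # stand-in image
      resources:
        requests:
          memory: "1Gi"        # the scheduler reserves only this much
          cpu: "500m"
        limits:
          memory: "8Gi"        # burst ceiling; exceeding it => OOMKilled
          cpu: "4"
```

Because scheduling is driven by requests, a 64 GiB node could admit dozens of such pods; if several burst toward their 8 GiB limits at the same time, the kernel's OOM killer starts terminating processes rather than swapping, which is exactly the behavior Gitpod hit.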
So they built Gitpod Flex, a Kubernetes-inspired but simplified control plane.
It solves most of these issues by:
Removing Kubernetes overhead while keeping declarative APIs.
Offering better security, performance, and ease of deployment.
Supporting self-hosting in under 3 minutes with better compliance.
The key lesson: Kubernetes is great for production workloads, but not always the best fit for developer environments.
Gitpod learned this the hard way and built a leaner, more efficient alternative.
Source: Gitpod Blog
2. Juspay Story
Juspay has a payment processing backend called Hyperswitch.
For Hyperswitch, Kafka plays an important role in event streaming, ensuring smooth data flow between application servers and storage.
Initially, Kubernetes was the go-to choice for container orchestration, providing a managed environment for scaling Kafka nodes.
However, as the workload grew, several unexpected challenges came up, making Kafka on Kubernetes inefficient and costly.
These were the three main pain points:
Resource Allocation Inefficiencies: Kubernetes resource management often mis-provisioned resources, leading to wasted CPU and memory. At scale, this led to much higher costs than expected.
Auto-Scaling Struggles: Kafka is stateful, but Kubernetes auto-scaling is designed for stateless applications. This led to 15-second message processing delays and increased latency during scale-ups.
Operational Complexity with Strimzi: Managing Kafka clusters with Strimzi (an operator for running Apache Kafka on Kubernetes) became a manual, error-prone process. Newly added nodes often failed to integrate, requiring frequent manual intervention.
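For context on the Strimzi point: with Strimzi, a Kafka cluster is declared as a custom resource and the operator reconciles brokers from it. A minimal sketch (field values are illustrative, not Hyperswitch's setup) looks roughly like this; scaling means editing `spec.kafka.replicas` and trusting the operator to integrate the new broker, which is the step the team found error-prone:

```yaml
# Hedged sketch of a Strimzi Kafka custom resource; values are illustrative.
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: hyperswitch-events    # hypothetical name
spec:
  kafka:
    replicas: 3               # bumping this asks the operator to add brokers
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
    storage:
      type: persistent-claim  # brokers are stateful; each needs its own PVC
      size: 100Gi
  zookeeper:
    replicas: 3
    storage:
      type: persistent-claim
      size: 20Gi
```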
To cut costs and improve Kafka performance, Hyperswitch migrated from Kubernetes to EC2.
Here are the results.
28% cost reduction, from $180/month per instance on Kubernetes to $130/month on EC2.
Easy vertical scaling, upgrading from T-class to C-class instances without disruptions.
More control over performance, ensuring stable operations under peak loads.
Key Takeaway: Not all workloads are ideal for Kubernetes.
While it excels at stateless, highly dynamic applications, stateful systems like Kafka often require more control over resources and scaling.
Source: JusPay Blog
3. Threekit Story
Threekit is an enterprise-grade 3D visualization and augmented reality (AR) platform that enables businesses to create interactive experiences for e-commerce, retail, and more.
In 2018, the Threekit team looked for a fully managed compute solution.
It required batch processing to handle large-scale rendering, data transformation, and content generation efficiently.
Kubernetes had just emerged as the industry standard, making it the clear choice at the time.
However, Kubernetes soon revealed the following problems:
High Costs: Running a cluster required redundant management nodes and over-provisioning due to slow autoscaling, leading to wasted resources.
Scaling Issues: Managing high job volumes was challenging, and solutions like Argo introduced additional complexity.
Operational Overhead: Even simple tasks required deep Kubernetes expertise, adding DevOps burden. Maintaining the cluster demanded dedicated Kubernetes engineers.
Lock-In Trap: Kubernetes clusters created dependencies that made it difficult to integrate external resources or migrate to a different setup.
While Kubernetes solved hardware management problems, it made infrastructure more complex and expensive to maintain.
To reduce complexity, they adopted Google Cloud Run.
Cloud Run scales to zero, meaning costs are based only on actual usage, unlike Kubernetes, which required paying for idle resources.
While Kubernetes scaling took minutes, Cloud Run scales up in seconds, ensuring seamless handling of traffic spikes.
Cloud Run, built on Google’s Borg, eliminates the need for Kubernetes cluster maintenance, simplifying deployments.
Cloud Run Tasks allowed up to 10,000 jobs per batch with built-in retries, removing the need for custom job scheduling infrastructure.
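As a rough sketch of the Cloud Run jobs model the article contrasts with custom scheduling infrastructure (the job name and image below are hypothetical; the commands and flags are from the `gcloud` CLI):

```shell
# Create a batch job that fans out into up to 10,000 tasks with built-in retries.
gcloud run jobs create render-batch \
  --image=us-docker.pkg.dev/my-project/renders/worker:latest \
  --tasks=10000 \
  --max-retries=3 \
  --region=us-central1

# Kick off an execution; each task reads CLOUD_RUN_TASK_INDEX to pick its shard.
gcloud run jobs execute render-batch
```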
Key Takeaway: For enterprises managing highly dynamic workloads, Kubernetes may still be useful.
However, for companies focused on simpler, cost-efficient, and scalable setups, services like Cloud Run offer a compelling alternative, eliminating infrastructure overhead without compromising performance.
Source: Threekit Blog
Wrapping Up
For those who transitioned from legacy systems to VMs and then to Kubernetes, it is clear that Kubernetes isn’t always the best fit for every workload.
In my personal experience, we tried to host all stateful apps (databases, messaging systems, etc.) outside of Kubernetes, although many companies successfully run them in Kubernetes.
Usually, problems start when you begin operating at scale: maintenance overhead, cost, and so on.
Does This Mean Kubernetes Isn’t the Ideal Platform for Apps?
Not at all! Kubernetes is more popular than ever, with rapid adoption across industries, including AI/ML workloads, cloud-native applications, and large-scale microservices architectures.
However, it’s not a one-size-fits-all solution. While Kubernetes excels in many areas, there are certain use cases where it may not be the best choice.
Global Kubernetes Market size was valued at USD 1.7 billion in 2023 and is poised to grow from USD 2.11 billion in 2024 to USD 11.78 billion by 2032, growing at a CAGR of 24.0% during the forecast period (2025-2032).
Note: You can share your suggestions/feedback in the comments on the web version here.