DevZero Launches Its Autonomous Compute and Inference Optimization Platform

Today, DevZero launched its new platform for autonomous compute and inference optimization. Since last year, we've been developing a system that profiles, schedules, and rightsizes Kubernetes workloads with zero restarts. It solves for uptime anxiety and runaway compute costs. We've now proven this platform with some great customers including DataBahn, Starburst Data, Dentira, OpenObserve, and Outerbounds.

As CEO of DevZero, I want to give you a personal take on this news. (For a dry, just-the-details take, see our press release.)

We founded DevZero in 2022 as a cloud development platform designed to improve coder productivity. In 2025, we began to pivot because the infrastructure tech we built to deliver our product proved more valuable than the original product.

Let me explain.

To help developers build and test software in the cloud, rather than on slow personal computers, we took care of their infrastructure needs. We found ourselves using Kubernetes to provide cloud-based dev environments. We were floored at how inefficient Kubernetes was and how much it ate into our margins. If we were overprovisioning CPUs, memory, and GPUs by north of 50%, what about everyone else?

We looked for solutions to Kubernetes overspending. Conventional autoscalers had numerous flaws that limited cost savings and, more crucially, hurt reliability and performance. One flaw stood out: autoscalers couldn't move a running workload to new compute resources without restarting it. That's a big problem, especially in AI. The last thing you want is to restart an LLM training run that costs you over $1 million a day.

Cutting the cloud bill isn't worth it if the result is downtime or a botched LLM training run. Restarts made companies, ours included, hesitant to trust autoscalers.

We began developing a system to optimize our Kubernetes infrastructure. It enabled us to provide dev environments more cost-efficiently, without tradeoffs in reliability or performance. Where autoscalers worked at the node or pod level, we figured out how to rightsize workloads. We also figured out how to snapshot workloads and live migrate them to new compute resources instantly, without restarts.

Meanwhile, it dawned on us: coder productivity was becoming less of a concern thanks to LLMs and coding agents. Our autonomous infrastructure optimization tools addressed two growing problems: uptime anxiety and runaway compute costs.

As you may have noticed, we rolled out our new platform quietly. We wanted to prove it thoroughly, with customers, before announcing anything official. Done and done.

To sum it up, we've built an autonomous compute and inference optimization platform that, we're confident, can outcompete the incumbents on infrastructure reliability, performance, and cost savings.

How does it work?

Our profiler continuously monitors clusters, nodes, and individual workloads to build statistical models of resource demand.
Our context-aware scheduling and autoscaling layer places workloads efficiently and provisions cost-effective capacity using real-time data across 3,000+ instance types, 69K+ price points, 23 GPU models, and 80+ regions spanning AWS, Azure, GCP, OCI, and OpenShift.
We then rightsize workloads in real time, adjusting CPU, memory, and GPU provisioning to reality.
When the unexpected happens, our checkpoint-restore enables instant live migration without restarts.
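
To make the rightsizing step a little more concrete, here is a minimal sketch of one common approach: pick a high percentile of observed usage and add some headroom. This is an illustration only, not DevZero's actual model. The function names, the 95th-percentile choice, and the 20% headroom are all assumptions made for the example.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a list of usage samples (hypothetical helper)."""
    if not samples:
        raise ValueError("need at least one usage sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # Nearest-rank method: the smallest value with at least pct% of samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def recommend_request(samples, pct=95, headroom=0.2):
    """Suggest a resource request: a high percentile of observed usage plus headroom.

    The defaults are illustrative assumptions, not values taken from the platform.
    """
    return percentile(samples, pct) * (1 + headroom)


def overprovisioned_fraction(current_request, recommended):
    """Share of the current request that the recommendation would free up."""
    if current_request <= 0:
        raise ValueError("current_request must be positive")
    return max(0.0, 1 - recommended / current_request)


if __name__ == "__main__":
    # Made-up CPU usage samples in millicores for a workload that requests 500m.
    cpu_usage_millicores = list(range(1, 101))
    recommended = recommend_request(cpu_usage_millicores)
    wasted = overprovisioned_fraction(500, recommended)
    print(f"recommended request: {recommended:.0f}m")
    print(f"overprovisioned by: {wasted:.0%}")
```

A real system would model demand over time rather than from a flat list of samples, but the shape of the decision is the same: measure what a workload actually uses, then provision close to that instead of to a guess.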

For an outside perspective on what we've built, check out Mike Vizard's article in Techstrong: DevZero Launches Automation Platform to Dynamically Rightsize Kubernetes Clusters.
