
How Kubernetes Operators May Conflict With Resource Optimization (And How to Avoid It)

A Kubernetes Operator is a method of packaging, deploying, and managing a Kubernetes application. It extends the native Kubernetes API by combining custom resources, defined through CustomResourceDefinitions (CRDs), with a dedicated controller: a custom control loop that continuously watches the state of those resources and works to bring the actual state in line with the declared one. The primary purpose of an operator is to automate the operation of complex, stateful applications (like databases, message queues, or monitoring suites) that would otherwise require human operational knowledge to maintain.
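To make the control-loop idea concrete, here is a minimal sketch in plain Python. It does not use the Kubernetes API or any operator framework; the names `DesiredState`, `ObservedState`, `reconcile`, and `run_control_loop` are hypothetical, chosen only to illustrate how a controller compares declared state with observed state and acts on the difference.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class DesiredState:
    """Hypothetical custom resource spec: the replica count the user declared."""
    replicas: int


@dataclass
class ObservedState:
    """Hypothetical view of what is actually running in the cluster."""
    replicas: int


def reconcile(desired: DesiredState, observed: ObservedState) -> list[str]:
    """Return the actions needed to move observed state toward desired state."""
    diff = desired.replicas - observed.replicas
    if diff > 0:
        return [f"create replica {observed.replicas + i + 1}" for i in range(diff)]
    if diff < 0:
        return [f"delete replica {observed.replicas - i}" for i in range(-diff)]
    return []  # already converged, nothing to do


def run_control_loop(desired: DesiredState, observed: ObservedState,
                     max_iterations: int = 10) -> list[str]:
    """Call reconcile until no actions remain.

    Assumption: a real controller reacts to watch events rather than
    iterating a fixed number of times, and its actions can fail.
    """
    log: list[str] = []
    for _ in range(max_iterations):
        actions = reconcile(desired, observed)
        if not actions:
            break
        log.extend(actions)
        observed.replicas = desired.replicas  # pretend every action succeeded
    return log


if __name__ == "__main__":
    for action in run_control_loop(DesiredState(replicas=3), ObservedState(replicas=1)):
        print(action)
```

The point of the sketch is that the loop is driven entirely by the declared state: whenever the observed state differs from it, the controller acts to close the gap, regardless of what caused the difference.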

New in Kubex: KAI Scheduler Integration for Shared GPU Inference

Today, we’re launching Kubex support for the KAI Scheduler and automated GPU sharing for inference workloads. As AI inference moves into production, platform teams are being asked to serve more models, support more teams, and control GPU costs at the same time. But many inference workloads do not need an entire GPU all the time. When teams reserve full GPUs or oversized GPU fractions to stay safe, expensive capacity can sit idle across the cluster.

The Inference Paradox: How Split-Brain LLMs Are Killing Your GPU ROI

At KCD (Kubernetes Community Days) Toronto, I attended an insightful talk on AI resource optimization that cited a staggering figure from a Gartner study: “AI infrastructure is adding $401 billion in new spending this year alone. Yet, real-world audits tell a much darker story, revealing that average GPU utilization in the enterprise is stuck at a dismal 5%.” While many people in the audience were shocked by that number, the data didn’t come as a surprise to us.

From Visibility to Real Savings: Turning FinOps Insights into Measurable Cost Reduction

FinOps programs are maturing, and most organizations have better visibility into cloud spend than ever before. Dashboards are full of data. And yet costs keep climbing. The problem isn’t the data. It’s the gap between knowing where the waste is and actually eliminating it. In this joint session, Tangoe and Kubex come together to bridge that gap. Tangoe brings deep expertise in spend management and FinOps discipline, while Kubex delivers infrastructure-level optimization across cloud, Kubernetes, and the AI and GPU workloads that are rapidly becoming the next frontier of cost pressure.

10 Enterprise AI Infrastructure Voices Worth Following

Enterprise AI has crossed an inflection point. The model problem is largely covered. What remains unsolved is the operational impact: how to run AI inference and agentic processes continuously, reliably, and at a cost that doesn’t cancel out the value. Most enterprises are discovering this the hard way. GPU utilization dashboards show 80%. Actual compute efficiency is half that. Token demand is compounding at 200-500% annually as agents multiply every action into dozens of model calls.