The latest News and Information on DevOps, CI/CD, Automation and related technologies.

7 best AI deployment platforms for production Kubernetes workloads in 2026

Training a model in a notebook is easy. What breaks teams is the step after: serving it reliably without haemorrhaging cloud budget or burying SREs in YAML. The common trap is picking a platform that handles the model but not the surrounding stack. An AI deployment platform should orchestrate the full application graph (inference endpoints, vector databases, caching layers, and frontends) inside a single VPC, with GPU autoscaling that doesn't require a dedicated platform engineer to babysit.

#056 - Cloud Contradictions and Cautionary Tales with Corey Quinn (The Duckbill Group)

In this episode of the Kubernetes for Humans podcast, Itiel sits down with the internet's favorite cloud contrarian, Corey Quinn of the Duckbill Group. Corey shares his unconventional career path as a "cautionary tale," explaining why his knack for fixing horrifying AWS bills makes him a terrible employee, and why he absolutely refuses to touch Kubernetes in production.

Context Engineering: How to Manage AI Context at Scale

Context engineering is the practice of managing the information an AI model sees (documents, tool outputs, memory, and structured metadata about the systems it reasons over) so it can make accurate decisions inside a real engineering organization. Most engineering teams have access to the same AI coding agents: Claude, GPT, Gemini, the major variants everyone is shipping. The model is no longer the differentiator.

What happens when you delete everything? Three minutes, or thirty hours.

Last year, at the annual conference for an open source framework you've definitely heard of, I walked up to the founder in a room outside the main stage. He was hunched over his laptop, frantic. We've known each other for a few years. "What's going on? Is everything okay?" He looked up with the specific shade of white that people only turn when they realize they've made a big mistake.

DORA Metrics in the AI Era: Why Deployment Isn't Faster

DORA metrics in the AI era reveal a paradox: PR volume is climbing, but deployment frequency is staying flat. In this talk, GitKraken's Director of Product Jeff Schinella breaks down why AI-accelerated code generation is creating a review bottleneck that your DORA metrics can't fully explain on their own. Jeff walks through how PR metrics (cycle time, first response time, code churn, and PR size) serve as the leading indicators behind your DORA data. If your deployment frequency is flat while PR counts go up, the bottleneck isn't your devs. It's your review capacity.
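As a rough illustration of the PR metrics named above, the sketch below computes median cycle time, median first response time, and median PR size from a list of pull request records. The record fields (`opened`, `first_review`, `merged`, `lines_changed`) and the function names are hypothetical, chosen for this example only; they are not taken from GitKraken or any specific tool, and code churn is left out because it needs per-commit history.

```python
from datetime import datetime
from statistics import median


def hours_between(start, end):
    """Elapsed time between two datetimes, in hours."""
    return (end - start).total_seconds() / 3600


def pr_metrics(prs):
    """Summarize hypothetical PR records into leading-indicator metrics.

    Each record is a dict with assumed keys: 'opened', 'first_review',
    'merged' (datetimes, the last two may be None) and 'lines_changed'.
    """
    cycle = [hours_between(p["opened"], p["merged"])
             for p in prs if p.get("merged")]
    response = [hours_between(p["opened"], p["first_review"])
                for p in prs if p.get("first_review")]
    sizes = [p["lines_changed"] for p in prs]
    return {
        "median_cycle_time_h": median(cycle) if cycle else None,
        "median_first_response_h": median(response) if response else None,
        "median_pr_size": median(sizes) if sizes else None,
    }


# Two made-up PRs, purely for demonstration.
sample_prs = [
    {"opened": datetime(2026, 1, 5, 9), "first_review": datetime(2026, 1, 5, 13),
     "merged": datetime(2026, 1, 6, 9), "lines_changed": 120},
    {"opened": datetime(2026, 1, 7, 9), "first_review": datetime(2026, 1, 7, 17),
     "merged": datetime(2026, 1, 9, 9), "lines_changed": 480},
]
summary = pr_metrics(sample_prs)
```

If median first response time and cycle time rise while PR counts climb, that is consistent with the review-capacity bottleneck the talk describes.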

Rightsizing Nightmares: When Your Cloud Cost Tool Degrades Performance

Here is what production teams see. A vertical pod autoscaler recommendation gets applied automatically. Resource requests come down a notch across a namespace. The cost dashboard registers a small savings win. A few minutes later, health checks start failing. Pods enter crash loops.
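One way to picture the failure mode is a pre-apply check that flags a recommendation when it lowers a request below observed peak usage plus some headroom. This is a minimal sketch under stated assumptions: the function name, the MiB units, and the 20% headroom default are all hypothetical, and the check is not something the article or any particular autoscaler prescribes.

```python
def is_risky_reduction(current_mib, recommended_mib, peak_usage_mib, headroom=1.2):
    """Return True when a recommendation lowers the memory request
    below observed peak usage times a headroom factor.

    All parameters are illustrative assumptions: values are in MiB and
    the default headroom of 1.2 (20%) is an arbitrary example.
    """
    lowers_request = recommended_mib < current_mib
    below_safe_floor = recommended_mib < peak_usage_mib * headroom
    return lowers_request and below_safe_floor


# Made-up numbers: cutting 512 MiB to 400 MiB while peak usage is 380 MiB
# leaves less than 20% headroom, so the reduction is flagged.
flagged = is_risky_reduction(current_mib=512, recommended_mib=400, peak_usage_mib=380)
```

A reduction that passes this kind of check can still cause trouble, for example during startup spikes that the observation window missed, so it is a coarse filter and not a guarantee.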

The cloud optionality blueprint: standardizing the stack to end vendor lock-in

Key takeaway: Real cloud strategy isn't about running the same workload everywhere at once; it’s about the freedom to move when you need to. By standardizing on a unified configuration file, Upsun enables true cloud optionality, turning provider migration from a re-architecture project into a data-move project.

AI writes the code. Who delivers it safely? | Harness Blog

The question for enterprise AI in 2026 is no longer just which model. It’s which harness. An agent harness is the system around the model. It decides what the agent remembers, what context it sees, what tools it can call, what it is allowed to do, and what happens when it is wrong. The model provides intelligence. The harness provides control. This is where the real engineering is happening.