Evals are just tests, so why aren't engineers writing them?

You’ve shipped an AI feature. Prompts are tuned, models wired up, everything looks solid in local testing. But in production, things fall apart—responses are inconsistent, quality drops, weird edge cases appear out of nowhere. You set up evals to improve quality and consistency. You use Langfuse, Braintrust, Promptfoo—whatever fits. You start running your evals, tracking regressions, fixing issues, and confidence goes up as a result. Things improve.

Top 5 Kubernetes Network Issues You Can Catch Early with Calico Whisker

Kubernetes networking is deceptively simple on the surface, until it breaks, silently leaks data, or opens the door to a full-cluster compromise. As modern workloads become more distributed and ephemeral, traditional logging and metrics just can’t keep up with the complexity of cloud-native traffic flows.

What Is a BadUSB? Understand the Threat and How to Prevent It

Lurking beneath the convenience and everyday nature of USB devices is a sophisticated cybersecurity threat known as BadUSB. BadUSB is a type of attack that leverages the reprogrammable firmware in USB devices (e.g., flash drives, keyboards, charging cables) to carry out malicious actions. Unlike traditional malware, which lives in the file system and can often be detected by antivirus tools, BadUSB lives in the firmware layer.

Incident IQ integration is here!

We’re excited to launch one of our most highly requested integrations: StatusGator now connects directly with Incident IQ. This powerful new integration bridges the gap between real-time service monitoring and your internal support workflow. Now, whenever someone reports an outage on your public StatusGator page, a ticket is automatically created in Incident IQ—ensuring your IT team can respond quickly and efficiently.

From Alert to Answer in Seconds: Accelerating Incident Response in Dynatrace

It is 12 PM and you have just started eating lunch when your phone starts buzzing. A storm of monitoring and system-level alerts starts stacking up on your phone and in Slack. The incident response "war room" opens, and downtime communications are being drafted for customers. Your team is under pressure to find the root cause, but you are immediately hit with roadblocks.

Building an Effective Post-Mortem Culture: A Step-by-Step Guide

Post-mortems are the cornerstone of continuous improvement in incident management. When done right, they transform failures into learning opportunities and prevent future outages. Yet many teams struggle to build a culture where post-mortems are valued rather than feared.

DevAIOps: A Call To Action For The Heroes Among Us

The year is 2025, and I’ve been watching teams discover what happens when you give developers AI superpowers without giving them AI super-governance. It’s like the merchandising scene from Spaceballs: “Vibe Coding: The Flamethrower. The kids love this one.” But here’s the thing: I’m not here to take away the flamethrowers. I’m here to hand out fire extinguishers and maybe suggest we practice in a safe room instead of the living room.

Getting started with the relaxAI API: Sovereign, cost-effective AI

Earlier this year, we launched relaxAI, an AI assistant designed with one paramount focus: your privacy. We’re now excited to announce that the relaxAI API is in General Availability (GA), offering an OpenAI interface. This gives UK organizations up to 90% cost savings versus leading providers while ensuring data never leaves UK jurisdiction.