Operations | Monitoring | ITSM | DevOps | Cloud

5 pitfalls to avoid when measuring DevEx in the AI era

Developer experience, commonly known as DevEx, describes how an organization’s systems, workflows, tools, and culture affect developer productivity. A positive DevEx leads to tangible organizational benefits, including faster releases, increased innovation, and reduced technical debt. Measuring DevEx enables engineering management to quantify their team’s impact and understand where to direct improvement efforts.

Debug and evaluate your AI app from your coding agent with Datadog Agent Observability

Coding agents like Claude Code, Cursor, and Codex CLI handle the coding parts of building an AI application well. The harder work comes after: understanding why a response went wrong, building eval sets that reflect real production behavior, and keeping up with an application that changes faster than any one-off script can. Teams spend 60–80% of their time on evaluation and error analysis, and much of that work needs to be redone every time the stack shifts.

Building an AI Ready Data Backbone: Dima Kan at AICamp 2026

The Aiven Platform is more than a collection of open source services for streaming, storing and analyzing data. The platform ensures that all services run reliably and securely in the clouds of your choice, are observable, and can easily be integrated with each other and with external 3rd party tools.

The Journey to Achieving Hyperscale Availability with AI-Driven Prediction

At hyperscale, a regional cloud outage is not merely a technical disruption—for Samsung Account, which serves 2.1 billion users across three global regions, it is an immediate global service crisis. Fragmented, region-siloed monitoring creates blind spots that make early detection nearly impossible, leaving SRE teams perpetually reactive rather than predictive. The path to proactive reliability requires both a philosophical shift and a foundational change in how observability data is collected, unified, and reasoned over.
Sponsored Post

From Dashboards to Conversational AI: The Evolution of UI in IT Products

The way IT teams interact with technology has changed dramatically over the years. From early text-based interfaces to today's dashboards and now conversational AI, each stage has reshaped how we monitor, diagnose, and understand complex IT environments. But while dashboards gave us visibility, they often led to more questions than answers. In this post, we briefly explore the evolution of UI in IT products and how conversational AI is bridging the gap between data and understanding.

Which Bugs AI Agents Fix Better With Traffic

In the first experiment, I wanted a baseline: if an AI coding agent gets the same production signal a human would get, can it fix bugs in a codebase it has never seen? Yes, but only when I gave it better context. With only an alert, the agent passed 51% of the runtime tests. When I added captured traffic, the actual request and response for the failing call, it climbed to 77%. This post is the second pass.

Instrumenting AI Agents for the Agent Timeline: A Practical OpenTelemetry Guide

AI agents are nondeterministic, multi-step, and opaque. When one fails in production, "the model said something weird" is the cheapest, most useless line in your incident postmortem. To debug agents the way they actually run, you need telemetry that captures all of it, in order, with enough context to reconstruct what happened. The OpenTelemetry GenAI Semantic Conventions give you a vendor-neutral way to do exactly that.

Why Observability Isn't Enough for AI Coding Agents

Observability platforms collect pre-instrumented logs, metrics, and distributed traces to monitor production systems and surface failures to human engineers. The adoption of AI into engineering has led observability providers to offer those same signals to agents. This is often packaged as AI observability, but the signals themselves were designed around a human investigation loop. AI coding agents work faster, consume data differently, and need feedback as they work rather than after deployment.