
Bits AI Dev Agent: Automatically identify issues and generate code fixes

The Bits AI Dev Agent is an AI-powered coding assistant in Datadog designed to reclaim developer productivity by autonomously monitoring telemetry data, identifying key issues, and generating production-ready pull requests. Developers receive asynchronous, context-rich PRs with clear explanations, allowing them to shift their focus from troubleshooting to reviewing solutions and building better code.

Introducing Bits AI SRE, your AI on-call teammate

Bits AI SRE is your AI on-call teammate, built to autonomously investigate alerts and coordinate incident response. Integrated with Datadog, Slack, GitHub, Confluence, and more, Bits analyzes telemetry, reads documentation, and reviews recent deployments to determine the root cause of alerts—often before you’ve even opened your laptop. In fact, if you're using Datadog On-Call, you can view Bits’s findings right from your phone—so you’re always one step ahead, no matter where you are.

Datadog Incident Response: Unify remediation and communication

With Datadog's new AI voice agent in Incident Response, you can quickly get up to speed on the issue and start taking action directly from your phone. Handoff notifications make it easy to jump straight to the relevant context and quickly communicate with other responders. Finally, our status pages enable you to automatically update users on your remediation progress.

Here's how to add business data to logs from retail endpoints | Datadog Tips & Tricks

Some sources simply do not generate data-rich logs. Retail endpoints that are older or run on proprietary services, for example, very often produce logs without the kinds of data that are needed to perform useful business analytics. So, what can you do?

Beyond Metrics: How We Reimagined Incident Response with RUM

When your monitoring tools and logs tell you everything's fine, but users can't access critical healthcare services, where do you look? Our team discovered that Real User Monitoring (RUM) isn't just for tracking page load times and user journeys – it's a powerful incident response tool that can uncover issues traditional monitoring misses entirely.

Built for Engineers: Datadog's Vision for the Future

Datadog was built by engineers, for engineers. Datadog Co-founder & CEO Olivier Pomel opened the keynote with a clear message: observability, security, and AI are converging. From infrastructure to AI agents, the future of engineering requires one unified platform. Catch all product announcements to see what’s next in observability and security on our YouTube channel!

Stay Compliant: Meet Your Audit Needs with Datadog!

Datadog's internal compliance team has built audit workflows and control monitoring capabilities using the Datadog platform. We actively use these capabilities to scale our audit programs and comply with multiple compliance frameworks. This session will go into the details of how we addressed our compliance use-cases using the Datadog platform and how our customers can get started.

How Cursor scaled infrastructure rapidly and reliably using Datadog

At Datadog, we use Cursor to empower our teams to build more quickly. And we know that building and troubleshooting with AI tools like Cursor is done best with the right observability data and context. Discover how Cursor was able to rapidly and reliably scale their infrastructure 100x using Datadog to meet the needs of a fast-growing user base. And learn more about how we’re bringing Datadog tools and context to your favorite AI IDEs and agents with our MCP Server and extensions.

AI-Augmented Control Plane: Scaling IT Operations with Intelligent Automation

How do you enable a team of 100 engineers to effectively support 300+ critical applications across five hosting platforms? At Thomson Reuters, we turned to AI - not as a buzzword, but as a genuine force multiplier. Experience our journey of transforming traditional IT operations into an AI-augmented powerhouse, where Datadog, ServiceNow, and custom AI solutions work in harmony to create a next-generation control plane. We'll share real victories, honest challenges, and practical insights from our mission to build a more intelligent operational framework.

LLM Observability for Reliability and Stability: A Monitoring Strategy for Phone Communication

LLM APIs offer groundbreaking potential, but also present challenges such as response latency, hallucinations, and service instability. In Japan, where telephone communication remains crucial for business, these issues present significant barriers to the introduction of LLM-based applications. Despite being a relatively young startup, we have developed and deployed an LLM-based telephone service that has handled over 40 million calls.