Monthly Archive

Turn feedback into action across your engineering org with Datadog Forms

Nov 26, 2025 By Barak Shoushan In Datadog

Engineering teams rely on forms for everything from approvals to checklists, yet the process usually lives outside engineering operations. Spreadsheets, one-off surveys, and external form builders capture inputs, but they create scattered data, slow follow-ups, and manual translation into actionable work. Datadog Forms enables teams to create and share interactive forms directly within Datadog.

Read Post

Define, run, and scale custom LLM-as-a-judge evaluations in Datadog

Nov 25, 2025 By Rashel Hoover In Datadog

Teams deploying LLM applications face a critical blind spot: They can measure speed and cost, but not whether their AI is actually giving good answers. To build user trust in these applications, teams also need to measure response quality, including factual accuracy, safety, and tone. Operational metrics show how a system behaves, but not whether its responses are correct or on brand.

Read Post

Introducing Bits AI SRE, your AI on-call teammate

Nov 24, 2025 By Datadog In Datadog

Bits AI SRE is your AI on-call teammate, built to autonomously investigate alerts and coordinate incident response. Integrated with Datadog, Slack, GitHub, Confluence, and more, Bits analyzes telemetry, reads documentation, and reviews recent deployments to determine the root cause of alerts—often before you’ve even opened your laptop. In fact, if you're using Datadog On-Call, you can view Bits’s findings right from your phone—so you’re always one step ahead, no matter where you are.

View Video

Build custom apps in seconds with conversational AI in App Builder

Nov 24, 2025 By Barak Shoushan In Datadog

Datadog App Builder is a low-code tool for creating internal apps. Its drag-and-drop interface lets engineering teams troubleshoot issues, optimize operations, and enable self-service while connecting directly to their Datadog data and permissions. Now, with conversational AI, teams can go from idea to working prototype even faster.

Read Post

Data Observability: Build confidence in the data life cycle

Nov 21, 2025 By Datadog In Datadog

Datadog Data Observability provides a complete solution with quality checks (e.g., volume, row changes, freshness), custom SQL-based monitors, anomaly detection, column-level lineage across systems like Snowflake and Tableau, full pipeline visibility, and targeted alerts when data issues arise.

View Video

Coordinate large-scale engineering initiatives with IDP Campaigns

Nov 21, 2025 By Amrita Lakhanpal In Datadog

As organizations grow, engineering leaders often need to drive cross-team initiatives such as reducing cloud spend, upgrading runtimes, or strengthening security controls. Tracking this work can quickly become fragmented across spreadsheets, dashboards, and status meetings. Progress is hard to measure, accountability is unclear, and the impact of each effort can be difficult to demonstrate.

Read Post

Use OpenTelemetry with Observability Pipelines for vendor-neutral log collection and cost control

Nov 20, 2025 By Micah Kim In Datadog

Today, many DevOps and security teams operate in a world of complex, hybrid, or multi-vendor environments. As more teams look to avoid lock-in by adopting open standards, OpenTelemetry (OTel) is quickly gaining adoption as the primary open source method for DevOps and security teams to instrument and aggregate their telemetry data. However, OTel alone may lack the advanced processing functions, native volume control rules, and hybrid environment support that large organizations need.

Read Post

How Datadog Feature Flags is resilient to cloud provider failures

Nov 19, 2025 By Anthony Rindone In Datadog

As major incidents like AWS’s October 2025 outage illustrate, modern systems are immensely interconnected. A failure in one can lead to a cascade of downstream problems. In this case, issues with DNS resolution for DynamoDB led to widespread disruptions with other AWS services and, subsequently, thousands of applications and services that rely on that infrastructure.

Read Post

Explore Cloud Instance Pricing and Performance with Datadog Instance Explorer

Nov 19, 2025 By Datadog In Datadog

This quick overview introduces Datadog Instance Explorer, a way to explore, compare, and monitor cloud instance pricing and performance across AWS, Azure, and Google Cloud in one place. Start exploring your instance options today and make smarter, data-driven infrastructure decisions.

View Video

Optimizing Ruby performance: Observations from thousands of real-world services

Nov 18, 2025 By Ivo Anjo In Datadog

Over the past three decades, Ruby has assumed a pivotal role in the modern web stack and become a fixture in the tool kits of countless DevOps and platform teams. Today, it is a driving force in contemporary application development, testing, automation, and CI/CD. For this blog post, we used data from our always-on continuous profiling of more than 3,000 real-world services from hundreds of organizations to track trends in Ruby usage and performance.

Read Post

Introducing Datadog Agent Builder: Build agentic workflows for alert response and remediation

Nov 18, 2025 By Amber Tunnell In Datadog

Building automated workflows that adapt to real-world complexity can be a challenge. As systems scale and scenarios multiply, teams often end up hardcoding endless logic branches just to handle every potential outcome. That’s why we’re introducing Datadog Agent Builder, a powerful new tool that lets you create custom AI agents that are fully hosted by Datadog.

Read Post

Datadog GPU Monitoring: Optimize and troubleshoot AI infrastructure

Nov 18, 2025 By Datadog In Datadog

With Datadog GPU Monitoring, engineering and ML teams can monitor GPU fleet health across cloud, on-prem, and GPU-as-a-Service platforms like Coreweave and Lambda Labs. Real-time insights into allocation, utilization, and failure patterns make it easy to spot bottlenecks, eliminate idle GPU spend, and resolve provisioning gaps. By tying usage metrics directly to cost and surfacing hardware and networking issues impacting performance, Datadog helps teams make fast, cost-efficient decisions to keep AI workloads running reliably at scale.

View Video

Bringing Observability to Data

Nov 14, 2025 By Datadog In Datadog

While observability practices have evolved in recent years, they have largely focused on application services and infrastructure. Yet it is data that powers our applications, businesses, and AI models. When data issues occur, the consequences can be far-reaching, from poor product experiences to billing errors to misinformed AI outcomes. In this session, Jonathan Morin, Group Product Manager at Datadog, shares real-world examples of incidents and explains how data observability can address them, helping teams detect issues earlier, reduce costly downtime, and restore trust in their data.

View Video

The Hidden Bottleneck in Latency: GetYourGuide's Database Performance Journey

Nov 14, 2025 By Datadog In Datadog

Fast front-end and back-end code alone won’t guarantee low end-to-end latency, because hidden bottlenecks in the database can undermine even the best engineering efforts. In this session, Oleksii Serhiienko, Senior Site Reliability Engineer at GetYourGuide, shares how his team put database performance at the center of their monitoring strategy. He highlights how they identified and fixed slow queries, uncovered load balancing issues that drove significant cost savings, and built monitoring practices that improved both reliability and investigation workflows.

View Video

Use Grok parsing to extract fields from logs | Datadog Tips & Tricks

Nov 12, 2025 By Datadog In Datadog

When your logs don’t follow a standard format, it can be difficult to extract valuable information, like key-value pairs and nested JSON objects. Grok parsing lets you define flexible patterns that match unstructured log data so you can extract specific fields to query, filter, and visualize. By refining your Grok parsers, you can make your logs more useful for analytics, dashboards, and alerts.

View Video

Sync your Backstage catalog with Datadog IDP

Nov 11, 2025 By Mark Avery In Datadog

Backstage is a popular open source framework for building internal developer portals (IDPs) used by organizations to aggregate service metadata and create a single source of truth for their software developers. However, data stored in the Backstage Software Catalog can quickly become siloed and inaccessible from monitoring tools such as Datadog.

Read Post

Eliminate unnecessary costs in your Amazon S3 buckets with Datadog Storage Management

Nov 10, 2025 By Mahashree Rajendran In Datadog

Cloud object storage powers a wide range of workloads, from AI training datasets to customer-facing media libraries. As your data grows into the petabyte scale, managing storage costs and ensuring reliability requires fine-grained visibility. You need answers to questions like: Which specific teams, services, workloads, or datasets are driving spend? Which data is cold and should be archived? What fixes will have the biggest impact on cost and performance?

Read Post

Observability and FedRAMP in Action: The VA's Mission to Deliver Reliable Digital Service

Nov 10, 2025 By Greg Reeder In Datadog

Ensuring digital services remain accessible, reliable, and secure is a high priority for any organization operating at scale. For the Department of Veterans Affairs (VA), this focus is central to its mission of providing quality care to veterans, their families, and caregivers. Often described as “the largest IT shop in the United States,” the VA manages 2.7 million pieces of equipment across a vast network of interconnected systems.

Read Post

How feedback loops power progressive software delivery

Nov 10, 2025 By Candace Shamieh In Datadog

Modern engineering teams face competing priorities. Developers are expected to deliver new features faster than ever, but users expect rock-solid reliability with every release. Shipping quickly can feel like you’re gambling with user trust. If you move too fast, you risk outages, but if you move too slowly, innovation stalls.

Read Post

Import Snowflake, Salesforce, ServiceNow, and Databricks metadata into Datadog with Reference Tables

Nov 6, 2025 By Jinwu Liu In Datadog

Engineering, operations, and security teams can struggle to make sense of their telemetry data in isolation. Logs, metrics, and events tell what is happening but are often missing critical metadata like who owns what, where it's coming from, or indicators of attack. These gaps in visibility slow down incident response, complicate cost control, and make business or security analytics much harder.

Read Post

Catch and remediate ECS issues faster with default monitors and the ECS Explorer

Nov 6, 2025 By Sumedha Mehta In Datadog

Organizations that run applications on Amazon Elastic Container Service (Amazon ECS) often juggle signals across container and task metrics, logs, and events while they hunt for the change or condition that broke a deployment. This work adds operational overhead and extends incident timelines as teams switch between tools and manually correlate symptoms.

Read Post

Key learnings from the State of Containers and Serverless report

Nov 6, 2025 By James Eastham In Datadog

We recently released the 2025 State of Containers and Serverless report, which examines cloud usage data from tens of thousands of Datadog customers. The study shows adoption trends across container orchestration platforms and serverless offerings, and it explores how organizations use those resources to optimize workloads for efficiency, cost, and simplicity.

Read Post

Bits AI SRE, Flex Frozen, and GPU Monitoring | DASH 2025

Nov 6, 2025 By Datadog In Datadog

Get a first look at Datadog’s biggest product reveals from DASH 2025. Meet Bits AI SRE, your 24/7 autonomous AI Site Reliability Engineer; Flex Frozen, for up to 7 years of managed log retention; and GPU Monitoring, for full visibility into your AI workloads. Experience the future of observability in action.

View Video

Turn fragmented runtime signals into coherent attack stories with Datadog Workload Protection

Nov 5, 2025 By Guillaume Fournier In Datadog

Security teams face a constant trade-off between detection coverage and alert fatigue. Broad, rule-based detection approaches surface every possible indicator of compromise (IoC) but generate unmanageable alert volumes. Narrow, tightly scoped rules reduce noise but risk missing critical signals. And while individual indicators of compromise can highlight suspicious behavior, they often lack the surrounding context needed to tell a complete story of how an attack unfolded.

Read Post

This Month in Datadog - October 2025

Nov 5, 2025 By Datadog In Datadog

In October’s episode of This Month in Datadog, Jeremy shows how you can use AI to query with natural language in DDSQL Editor, ensure teams stay continually updated during outages, mitigate the impact of flaky tests, and ingest OpenTelemetry Protocol (OTLP) metrics from serverless and third-party SaaS environments.

Read Post

Triaging an Incident with a Critical Data Pipeline at #rivian

Nov 5, 2025 By Datadog In Datadog

Rivian makes electric vehicles to advance its mission to keep the world adventurous forever. As software-defined vehicles, Rivian’s R1T and R1S are connected to the cloud from day one, and telemetry data is at the heart of enabling mobile notifications, remote diagnostics, fleet management, and more. With so many critical pipelines in the cloud, observability is a top priority for the data platform.

View Video

Safely Roll Out Features with Datadog Feature Flags

Nov 4, 2025 By Datadog In Datadog

In this short demo, see how Datadog Feature Flags helps teams release new functionality safely and efficiently. Datadog provides advanced targeting, progressive rollouts, and automatic rollbacks, all integrated with powerful observability data. Learn how you can use simple on/off flags or multi-variant configurations to test and deploy features with confidence. With built-in monitoring of key guardrail metrics, Datadog can automatically pause or reverse rollouts when issues are detected, keeping your releases stable.

View Video

Building Smarter AI Products #Datadog #DASH #AI

Nov 4, 2025 By Datadog In Datadog

AI capabilities are advancing faster than ever — transforming how teams design, build, and ship intelligent products. In this teaser from Building Successful AI-powered Products at Datadog DASH, experts discuss the rise of agent-based systems, evolving model capabilities, and how to stay ahead in the new era of automation.

View Video

How Datadog is Reinventing On-Call #Datadog #OnCall #DevOps

Nov 4, 2025 By Datadog In Datadog

Datadog is reimagining how engineers handle incidents—moving beyond simple alerts to an intelligent, voice-driven on-call experience. With Datadog On-Call, teams can acknowledge alerts, access runbooks, post to Slack, and collaborate in real time, all before even touching their computer. See how Datadog brings incident response, communication, and automation together so you can respond faster and keep customers informed.

View Video

Understand user experience through network performance with Datadog Synthetic Monitoring

Nov 3, 2025 By Lauren Zuniga In Datadog

When an application slows down or fails, pinpointing the cause isn’t always simple. Is it a backend regression, a misbehaving API, or a bottleneck somewhere deep in the network? Without full visibility, teams waste precious time troubleshooting across disconnected tools and layers. Datadog Synthetic Monitoring now supports Network Path to help you proactively identify whether user-facing issues stem from your code or from the underlying network.

Read Post

Accelerate your Azure integration setup with guided onboarding

Nov 3, 2025 By Michael Cronk In Datadog

Getting started with monitoring for Microsoft Azure environments can be a lengthy and manual process. Many tools require users to create app registrations, assign permissions, and enable log forwarding or telemetry data collection across multiple portals and scripts. These fragmented steps slow down onboarding and introduce opportunities for misconfiguration, making it harder for teams to quickly achieve full visibility.

Read Post

Monitor OCI spend, AI in DDSQL Editor, OTLP Metrics API, and more | This Month in Datadog

Nov 3, 2025 By Datadog In Datadog

See how you can gain insights into cloud costs by tracking OCI spend and easily comparing instance types in October’s episode of This Month in Datadog. Join us for a spotlight of Cloud Cost Management’s support for Oracle Cloud Infrastructure, and the product’s new feature, Instance Explorer, which enables you to visualize and easily compare the cost and performance of instances across AWS, Azure, and Google Cloud.

View Video

How to Use Nested Queries in Datadog for Advanced Metrics Analysis

Nov 3, 2025 By Datadog In Datadog

Discover how nested queries in Datadog empower you to perform deeper, multilayered metrics analysis. In this video, Colten from the Metrics team walks through how to reuse query results to.

View Video
