Monthly Archive

Datadog acquires Adaptive ML

Jun 30, 2026 By Alexis Lê-Quôc In Datadog

Off-the-shelf models are easy to deploy, but they are rarely enough to solve complex, domain-specific challenges in production. The key to sustained AI value is not in the models themselves but in the ability to tune, evaluate, and refine those models against your organization’s real-time signals. We are excited to announce that Adaptive ML is joining Datadog to accelerate this vision by combining our deep observability data with their expertise in building specialized, high-performance AI agents.

5 pitfalls to avoid when measuring DevEx in the AI era

Jun 30, 2026 By Datadog In Datadog

Developer experience, commonly known as DevEx, describes how an organization’s systems, workflows, tools, and culture affect developer productivity. A positive DevEx leads to tangible organizational benefits, including faster releases, increased innovation, and reduced technical debt. Measuring DevEx enables engineering management to quantify their team’s impact and understand where to direct improvement efforts.

Debug and evaluate your AI app from your coding agent with Datadog Agent Observability

Jun 30, 2026 By Michael Bevilacqua-Linn In Datadog

Coding agents like Claude Code, Cursor, and Codex CLI handle the coding parts of building an AI application well. The harder work comes after: understanding why a response went wrong, building eval sets that reflect real production behavior, and keeping up with an application that changes faster than any one-off script can. Teams spend 60–80% of their time on evaluation and error analysis, and much of that work needs to be redone every time the stack shifts.

The Journey to Achieving Hyperscale Availability with AI-Driven Prediction

Jun 30, 2026 By Datadog In Datadog

At hyperscale, a regional cloud outage is not merely a technical disruption—for Samsung Account, which serves 2.1 billion users across three global regions, it is an immediate global service crisis. Fragmented, region-siloed monitoring creates blind spots that make early detection nearly impossible, leaving SRE teams perpetually reactive rather than predictive. The path to proactive reliability requires both a philosophical shift and a foundational change in how observability data is collected, unified, and reasoned over.

Women In Tech Panel - Engineering with AI in the Stack

Jun 26, 2026 By Datadog In Datadog

Every team is doing something with AI right now. What that something is, is an entirely different question. And whether that something is successful? Most teams are still figuring it out as they go.

The AI Engineering Playbook: How to Evaluate & Iterate at Every Phase of Development

Jun 26, 2026 By Datadog In Datadog

AI coding tools are accelerating development velocity, creating a release challenge most teams aren’t equipped for. Without controlled rollout, higher change velocity makes it harder to know which specific release drove the results you’re seeing in production. And when teams use AI to build AI (LLM apps and AI agents), complexity multiplies. Traditional observability can’t ensure AI agent quality, performance, and cost-efficiency at production scale.

From Legacy to AI-Ops: Securing and Scaling Systems for 20M Device Requests with Datadog

Jun 26, 2026 By Datadog In Datadog

Modernizing a legacy system serving 20 million devices without users noticing is like replacing a jet engine mid-flight. In this session, YoungJin Jung and Donggen Hong from LG U+ share their 18-month journey transforming a Telco-scale API Gateway from a rigid, proprietary solution into a high-performance, open-source architecture on AWS, and the operational challenges they solved along the way.

Ship Reliable AI Faster: How to Operate AI Agents with Control and Confidence

Jun 26, 2026 By Datadog In Datadog

Replace "AI shipped on hope" with an operating model that holds up once real users depend on it. AI quality is multi-dimensional, covering accuracy, tone, safety, and faithfulness to user data, and can't be debugged from outputs alone. Without visibility into what their AI actually did in production, teams miss regressions, reverse-engineer chains by hand, and watch a single bad answer erode trust built over hundreds of right ones.

Reduce CDN log costs with searchable archives

Jun 26, 2026 By Rufina Mariam In Datadog

Engineering teams that manage high-volume log sources, such as content delivery network (CDN) edges, streaming platforms, and authentication systems, often have to make a difficult retention tradeoff. Indexing every event keeps logs searchable during investigations, audits, and postmortems, but it can make long-term retention expensive.

How we saved over $3 million in idle compute costs with Datadog Kubernetes Autoscaling

Jun 25, 2026 By Jacob Simonov In Datadog

At Datadog, our broad Kubernetes footprint amplifies the significance of a familiar autoscaling tradeoff: Overprovisioning wastes cloud spend, while underprovisioning threatens reliability. We built Datadog Kubernetes Autoscaling (DKA) to help teams rightsize their workloads by generating intelligent resource recommendations and automating multidimensional workload scaling. Across Datadog, adopting DKA has eliminated more than $3 million in annualized idle compute costs while reducing reliability risks.

How to migrate feature flags without breaking production

Jun 23, 2026 By Anthony Rindone In Datadog

Feature flag migrations have a reputation problem. Ask anybody who’s been through one before and you’ll hear the stories, usually from someone still a little frustrated about a bad cutover, with a postmortem or two to show for it. The reputation is mostly undeserved. While the risks are real, they’re well understood and easily controlled. Getting a migration right doesn’t require a big coordinated effort.

Using Evaluation Frameworks with Agent Observability

Jun 22, 2026 By Jennifer Mickel In Datadog

AI teams have invested heavily in evaluation frameworks, yet getting those frameworks beyond local experimentation remains challenging. Teams using open source libraries like DeepEval and Pydantic Evals gain flexibility and research-grounded metrics, but operationalizing those evaluations still requires brittle custom integration code that doesn’t scale.

How Coding Agents are Changing the Traditional Software Development Lifecycle

Jun 22, 2026 By Datadog In Datadog

AI coding assistants are rapidly evolving from passive copilots into active, agentic collaborators capable of planning, executing, and iterating on complex software tasks. This shift has huge ramifications for the software development lifecycle (SDLC), developer productivity, and even the structure of engineering teams.

Fireside Chat with Datadog CPO Yanbing Li and Vercel CPO Tom Occhino

Jun 22, 2026 By Datadog In Datadog

The way we build, ship, and run software is being reshaped by AI. In this fireside chat, Yanbing Li (CPO, Datadog) and Tom Occhino (CPO, Vercel) will discuss their perspectives on the impact AI is having across the industry and what it means for teams navigating this shift today.

The New Shape of Engineering

Jun 22, 2026 By Datadog In Datadog

AI’s ability to write code made huge strides over the past year. Today, coding agents aren’t just assisting developers; they are winning the "coding race" by orders of magnitude and fundamentally changing the way engineers work.

Progressing AI Beyond Scaling and Into Deep Reasoning

Jun 22, 2026 By Datadog In Datadog

The breakthroughs in AI today aren’t just coming from bigger datasets and more compute; Reinforcement Learning (RL) has quietly become one of the most powerful forces in modern AI development. RL is teaching models to reason and self-correct, enabling capabilities that make AGI feel less like science fiction and more like an inevitable future.

Datadog Data Observability: Be the first to know when data fails

Jun 17, 2026 By Datadog In Datadog

Bad data doesn't announce itself. Datadog Data Observability gives you unified visibility across your entire data stack—from source systems and pipelines to dashboards and AI applications—so you catch silent failures before they cascade. Detect data quality and pipeline issues before stakeholders do, pinpoint root causes with end-to-end lineage, and reduce pipeline costs with job, cluster, and query recommendations.

DASH 2026 Keynote

Jun 10, 2026 By Datadog In Datadog

At DASH 2026, Datadog launched 100+ capabilities to help customers drive autonomy and manage growing AI and security complexity. With new Bits AI, log management, and security capabilities, customers have the visibility and autonomous operations they need to detect, investigate, and resolve issues across the development loop and data lifecycle. Tune in to the full keynote to catch the highlights.

Store and search high-volume logs with ClickHouse and Datadog

Jun 10, 2026 By Andy Lihani In Datadog

As teams scale AI and agentic workloads, log volumes can grow fast. That growth can force teams into a difficult trade-off: Keep logs searchable in their existing workflows, or store them cost-effectively for longer periods. For teams that rely on logs during incident response, compliance reviews, and long-running investigations, losing either affordability or searchability can slow down troubleshooting. Datadog and ClickHouse are partnering to help remove that trade-off.

DASH 2026 Operating at Scale: Guide to Datadog's newest announcements

Jun 9, 2026 By Datadog In Datadog

A challenge for many teams continues to be managing cost, governance, and reliability across an ever-larger footprint. This year’s DASH announcements help teams operate efficiently at scale, with new tools to cut cloud and AI spend, eliminate waste automatically, maintain observability during outages, and manage many organizations and agents as a single unit.

Turn Datadog findings into automated code fixes with Bits Code

Jun 9, 2026 By Datadog In Datadog

Engineering teams lose hours in the gap between detecting a problem and getting a fix into review. An on-call engineer sees an error spike in Datadog, pivots to traces and logs to isolate the failure, opens the relevant repository, reproduces the issue, writes a fix, adds tests, waits on CI, and finally opens a pull request. Even when the problem is familiar, the workflow pulls engineers across several tools and stretches remediation from minutes into hours or days.

Infinite Cardinality Metrics: Custom metrics built for modern systems

Jun 9, 2026 By Josh Mirchin In Datadog

Every technology shift adds new context you need to measure. Cloud computing added regions and services. Kubernetes added containers and pods. Multi-tenant applications added users and tenants. AI systems add models, prompts, agents, and execution paths. The result is that metrics are becoming dramatically more dimensional, faster than ever before. Over time, engineers are forced to make tradeoffs.

Get reliable answers to business questions with Bits Data Analysis

Jun 9, 2026 By Jonathan Morin In Datadog

Teams are wiring AI coding agents straight to their warehouse over MCP and asking things like “What was our revenue by channel in Q2?” The agent finds a revenue table, runs a query, and returns a number in seconds, with no waiting on the data team. The answer looks right at first, but the number is often wrong.

Autonomously monitor for impactful degradations with Bits Detection

Jun 9, 2026 By Samantha Scaglione In Datadog

Monitoring is built around the system a team understands at a point in time. Engineers add endpoints, move dependencies, and change user flows every day. Over time, that creates coverage drift as monitors keep reflecting the system as it used to behave, while changing paths introduce failure modes that teams didn’t yet know to watch for. Bits Detection automatically creates, tunes, and maintains monitors for your services.

Search and act across Datadog to resolve issues faster with Bits Chat

Jun 8, 2026 By Nicole Parisi In Datadog

Finding the right information across dashboards, monitors, and telemetry sources takes time, even for experienced engineers. When something breaks, it often means figuring out where to start, rebuilding queries, and jumping between metrics, logs, and traces before you can take action. The challenge isn’t a lack of data but the effort required to surface the right information at the right moment.

Give your AI agents live Datadog access from the command line

Jun 5, 2026 By Cody Lee In Datadog

AI agents are becoming a standard part of how engineers write, deploy, and troubleshoot software. Getting observability data into those workflows, securely and without manual intervention, remains the harder problem.

Introducing Bits Agent Builder: Build agentic workflows for alert response and remediation

Jun 4, 2026 By Amber Tunnell In Datadog

Building automated workflows that adapt to real-world complexity can be a challenge. As systems scale and scenarios multiply, teams often end up hardcoding endless logic branches just to handle every potential outcome. That’s why we’re introducing Bits Agent Builder, a powerful new tool that lets you create custom AI agents that are fully hosted by Datadog.

Migrate to Azure Managed Redis with Datadog and Eden

Jun 1, 2026 By Michael Cronk In Datadog

Azure Managed Redis is a Microsoft first-party, fully managed in-memory data store, replacing Azure Cache for Redis tiers. It includes Redis Enterprise features such as RediSearch for vector search and full-text search, in addition to RedisJSON, RedisTimeSeries, and Active Geo-Replication. As Azure Cache for Redis reaches end of life, more teams are planning migrations to Azure Managed Redis in search of better performance, lower cost, and modern capabilities for AI and real-time workloads.

How we cut Spark compute costs by 44% with agentic AI and Datadog Jobs Monitoring

Jun 1, 2026 By Charles Yu In Datadog

Spark jobs only get more expensive and harder to debug as they scale. It’s a problem we’ve run into ourselves. Our Referential Data Platform team builds and maintains the knowledge graph that maps relationships between customers’ observability entities. ServiceQueryEdge is at the center of that graph, mapping service entities to their associated metric and log queries.

A deep dive into AWS data perimeter misconfigurations

Jun 1, 2026 By Mallory Mooney In Datadog

In AWS environments, a data perimeter is a set of preventative controls that help ensure that your trusted cloud identities (principals or AWS services acting on your behalf) are accessing trusted resources from authorized networks. You can apply these controls at various levels of your infrastructure, such as per resource or across all resources in your AWS account.
