How to Track Cloud Costs in Real-Time Instead of Waiting Days

Tired of waiting days to see your AWS bill spike? Datadog addressed this problem with Apache Iceberg, delivering real-time cloud cost visibility that updates every 15 minutes instead of waiting for billing data to arrive. Here's how it works: they sync real-time resource inventory (EC2 instances, Kubernetes pods) into Iceberg tables, then use Trino to join those snapshots with unit pricing data. The result? FinOps teams can catch cost anomalies before they become budget disasters.
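
As a rough illustration of the join described above, the Python sketch below multiplies each inventory snapshot row by an hourly unit price, prorated to the 15-minute snapshot interval. It is not Datadog's implementation, which the summary says runs in Trino over Iceberg tables; the record fields, the prices, and the `estimate_costs` function are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumption: one inventory snapshot is taken every 15 minutes.
SNAPSHOT_INTERVAL = timedelta(minutes=15)


@dataclass(frozen=True)
class ResourceSnapshot:
    """One row of a (hypothetical) resource inventory snapshot."""
    snapshot_time: datetime
    resource_id: str
    instance_type: str


def estimate_costs(snapshots, hourly_prices, interval=SNAPSHOT_INTERVAL):
    """Join snapshot rows with unit prices and sum the cost per snapshot time.

    Rows whose instance type has no known price are skipped, mirroring
    an inner join between the inventory and pricing tables.
    """
    hours = interval.total_seconds() / 3600
    totals = {}
    for snap in snapshots:
        price = hourly_prices.get(snap.instance_type)
        if price is None:
            continue  # no pricing row to join against
        totals[snap.snapshot_time] = totals.get(snap.snapshot_time, 0.0) + price * hours
    return totals


# Illustrative data only: these prices are made up, not real AWS rates.
t0 = datetime(2025, 1, 1, 12, 0)
example_snapshots = [
    ResourceSnapshot(t0, "i-aaa", "small"),
    ResourceSnapshot(t0, "i-bbb", "large"),
    ResourceSnapshot(t0, "i-ccc", "unpriced"),
]
example_prices = {"small": 0.10, "large": 0.20}
example_totals = estimate_costs(example_snapshots, example_prices)
```

Because the estimate depends only on inventory and a price list, it can be recomputed whenever a new snapshot lands, without waiting for the provider's billing export.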

How Datadog Manages 50,000 Apache Iceberg Tables at Scale

Think managing a few database tables is hard? Try 50,000 production Iceberg tables storing petabytes of data with 8 million scans per day. In this clip, Datadog's platform team reveals the architecture choices behind their managed Iceberg implementation that serves hundreds of internal engineering teams.

Datadog at AWS re:Invent, Bits AI SRE, MCP Server, CloudPrem, and more | This Month in Datadog

Get a closer look at features we announced at AWS re:Invent in the latest episode of This Month in Datadog. Tune in for spotlights of Bits AI SRE, now generally available, and Datadog’s MCP Server, which connects AI agents to our platform by ingesting prompts and mapping them to Datadog resources and data. This Month in Datadog brings you the latest updates on our newest product features, announcements, resources, and events.

Datadog on Apache Iceberg

Historically, Datadog has relied on technologies like Snowflake and Apache Spark running on raw Parquet files (which lack a consistent table structure) to power internal analytics and data science at scale. As usage grew across product teams, more features came to depend on data science teams, and our datasets expanded to include more telemetry data, these systems became complex to manage and govern, both technically and financially. The need for a more flexible and scalable solution led Datadog to adopt Apache Iceberg, an open source table format for data lakes that brings reliability and performance while remaining SQL-friendly.

Optimize Your Oracle Cloud (OCI) Spend with Datadog Cloud Cost Management

Support for Oracle Cloud Infrastructure (OCI) is now live in Datadog Cloud Cost Management. In this short demo, you’ll learn how to get granular visibility into OCI cost and usage by service, compartment, tag, and resource tier; uncover savings opportunities by combining cost data with observability metrics like CPU, memory, and storage utilization; and set up anomaly monitors and budgets to avoid cost overruns, especially for high-risk workloads like AI and GPU training.

Datadog Bits AI SRE: Your new teammate for on-call shifts

Bits AI SRE is an always-on SRE agent built to handle complex troubleshooting and late-night alerts. Developed against thousands of real-world incidents and powered by Datadog’s platform, Bits AI SRE analyzes your entire stack, tests hypotheses, and identifies root causes in minutes. Resolve faster, get back to sleep sooner, and give your on-call team the confidence and capacity they need.

Introducing Bits AI SRE, your AI on-call teammate

Bits AI SRE is your AI on-call teammate, built to autonomously investigate alerts and coordinate incident response. Integrated with Datadog, Slack, GitHub, Confluence, and more, Bits analyzes telemetry, reads documentation, and reviews recent deployments to determine the root cause of alerts—often before you’ve even opened your laptop. In fact, if you're using Datadog On-Call, you can view Bits’s findings right from your phone—so you’re always one step ahead, no matter where you are.

Data Observability: Build confidence in the data life cycle

Datadog Data Observability provides a complete solution with quality checks (e.g., volume, row changes, freshness), custom SQL-based monitors, anomaly detection, column-level lineage across systems like Snowflake and Tableau, full pipeline visibility, and targeted alerts when data issues arise.
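
To make the freshness and volume checks mentioned above concrete, here is a minimal Python sketch of what such checks might compute. It does not use or represent Datadog's API; the function names, thresholds, and tolerance are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def is_fresh(last_updated, now, max_age):
    """Freshness check: the table was updated within the allowed window."""
    return now - last_updated <= max_age


def volume_in_range(row_count, baseline, tolerance=0.5):
    """Volume check: row count stays within +/- tolerance of a baseline.

    `tolerance` is a fraction, so 0.5 allows 50% deviation either way.
    """
    lower = baseline * (1 - tolerance)
    upper = baseline * (1 + tolerance)
    return lower <= row_count <= upper


# Illustrative values only.
now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
fresh = is_fresh(now - timedelta(hours=1), now, max_age=timedelta(hours=2))
stale = is_fresh(now - timedelta(hours=3), now, max_age=timedelta(hours=2))
normal_volume = volume_in_range(120, baseline=100)
low_volume = volume_in_range(40, baseline=100)
```

A real monitor would typically derive the baseline from historical loads instead of a fixed number; the fixed value here keeps the sketch short.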

Explore Cloud Instance Pricing and Performance with Datadog Instance Explorer

Meet Datadog Instance Explorer, a way to explore, compare, and monitor cloud instance pricing and performance across AWS, Azure, and Google Cloud in one place. Start exploring your instance options today and make smarter, data-driven infrastructure decisions.