
How we use Datadog to get comprehensive, fine-grained visibility into our email delivery system

Visibility into email performance is indispensable to any organization that counts on reaching people through their inboxes, and Datadog is no exception. SREs, FinOps practitioners, and many other teams rely on email as a critical channel for communications from our platform, including monitor alerts, usage reports, and service account notifications. At Datadog, we depend on the visibility provided by our integrations for Mailgun, SendGrid, and Amazon SES to optimize our email performance and ensure deliverability.
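As a concrete illustration of what alerting on email deliverability can look like, the sketch below uses Datadog's official Python client to create a metric monitor on bounce volume. The metric name `sendgrid.emails.bounce.count`, the threshold, and the notification handle are assumptions for illustration only; the exact metric names depend on which email integration you have enabled.

```python
from datadog_api_client import ApiClient, Configuration
from datadog_api_client.v1.api.monitors_api import MonitorsApi
from datadog_api_client.v1.model.monitor import Monitor
from datadog_api_client.v1.model.monitor_type import MonitorType

# Credentials are read from the DD_API_KEY / DD_APP_KEY environment variables.
configuration = Configuration()

# Hypothetical bounce-volume query; substitute the metric reported by your
# Mailgun, SendGrid, or Amazon SES integration.
monitor = Monitor(
    name="Email bounce volume is elevated",
    type=MonitorType("metric alert"),
    query="sum(last_15m):sum:sendgrid.emails.bounce.count{*}.as_count() > 100",
    message="Bounce volume is above the expected baseline. @slack-email-oncall",
    tags=["team:email-platform"],
)

with ApiClient(configuration) as api_client:
    created = MonitorsApi(api_client).create_monitor(body=monitor)
    print(f"Created monitor {created.id}")
```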

Keep stakeholders informed with Datadog Status Pages

When incidents occur, clear communication can be just as important as fast remediation. Your internal teams need timely updates to stay aligned, and your users want to know what is happening and when they can expect a fix. Without a reliable way to proactively share updates, support teams can get flooded with tickets and customer trust can erode. Datadog Status Pages, now generally available, makes it easy to keep everyone informed through a public or internal web page during outages.

Scaling Datadog observability: 1,000 integrations and counting

Integrations have always been central to the Datadog platform, enabling customers to collect the data they need directly from the technologies they use every day. By unifying signals from infrastructure and applications to security and SaaS tools, teams gain both high-level visibility and the ability to drill into the details that matter most. With more than 1,000 integrations now available, the Datadog ecosystem continues to expand alongside the platforms our customers rely on.

Monitor Slurm with Datadog

Slurm (Simple Linux Utility for Resource Management) is an open source workload management system used to schedule jobs and manage resources for high-performance computing (HPC) Linux clusters. It ensures that jobs and resources are scheduled fairly and efficiently, and it scales across large clusters, something that native Linux process management tools struggle to do.
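To make the scheduling picture concrete, here is a minimal sketch of the kind of visibility you might forward from a Slurm cluster: it shells out to `squeue` to count jobs by state and reports them to the Datadog Agent over DogStatsD. This is an illustrative pattern, not the official Slurm integration, and the metric name `slurm.jobs.count` is an assumption.

```python
import subprocess
from collections import Counter

from datadog import initialize, statsd

# Assumes a local Datadog Agent with DogStatsD listening on the default port.
initialize(statsd_host="127.0.0.1", statsd_port=8125)

def report_job_states() -> None:
    # `squeue -h -o %T` prints one job state (PENDING, RUNNING, ...) per line.
    output = subprocess.run(
        ["squeue", "-h", "-o", "%T"],
        capture_output=True, text=True, check=True,
    ).stdout
    counts = Counter(state for state in output.splitlines() if state)

    # Hypothetical metric name, tagged by job state so you can graph the queue mix.
    for state, count in counts.items():
        statsd.gauge("slurm.jobs.count", count, tags=[f"state:{state.lower()}"])

if __name__ == "__main__":
    report_job_states()
```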

Ship features faster and safer with Datadog Feature Flags

Releasing new features is one of the highest-stakes moments in the software delivery life cycle. Even with CI/CD pipelines in place, plenty of things can still go wrong when a feature goes live for actual users. Most feature flagging tools operate in isolation from important observability tooling, forcing engineers to monitor changes across multiple disconnected systems to fully understand their impact. This slows down development and increases the chance of missing critical issues.
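To show why tying flag evaluations to telemetry matters, here is a deliberately generic sketch of a percentage rollout whose evaluations are tagged in DogStatsD. It is not the Datadog Feature Flags SDK; the function `is_enabled`, the flag name, and the metric `feature_flag.evaluated` are all hypothetical, but the pattern of tagging telemetry with the variant is what lets you slice errors and latency by flag state.

```python
import hashlib

from datadog import initialize, statsd

initialize(statsd_host="127.0.0.1", statsd_port=8125)

def is_enabled(flag: str, user_id: str, rollout_pct: float) -> bool:
    """Deterministically bucket a user into a percentage rollout (illustrative only)."""
    bucket = int(hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest(), 16) % 100
    enabled = bucket < rollout_pct
    # Tag each evaluation so dashboards and monitors can compare on vs. off cohorts.
    statsd.increment(
        "feature_flag.evaluated",
        tags=[f"flag:{flag}", f"variant:{'on' if enabled else 'off'}"],
    )
    return enabled

if is_enabled("new-checkout-flow", user_id="user-123", rollout_pct=10):
    ...  # serve the new code path; its telemetry carries the variant:on tag
else:
    ...  # serve the existing path
```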

Datadog Feature Flags, track Claude costs, migrate historical logs, and more | This Month in Datadog

See how you can reduce risk during feature rollouts in September’s This Month in Datadog. In this episode, we spotlight Datadog Feature Flags, which combines advanced targeting with built-in observability and guardrails to make rollouts safer and more controlled. Plus, we cover more of this month’s product updates. This Month in Datadog brings you the latest updates on our newest product features, announcements, resources, and events.

From Logs to Insights: Accelerate Customer-Impact Analysis with Datadog Sheets

Datadog Sheets helps you move from log exploration to actionable insights quickly, with no code required. In this demo, see how to enrich logs with Salesforce data, build pivot tables, uncover customer impact trends, and create shareable reports, all within Datadog.

Model your architecture with custom entities in the Datadog Software Catalog

Every software organization has its own unique architecture and workflows. Beyond services and APIs, teams rely on internal libraries, CI/CD jobs, data pipelines, AI agents, and more to keep systems running smoothly. But as architectures grow more complex and interconnected, it can become difficult to keep track of all the structural dependencies and interactions in one place.

Monitor your data pipelines with Airflow lineage

In complex data pipelines with dozens of jobs and intermediary datasets, it can be difficult to effectively monitor how data travels and changes through each step. When tracking issues in these pipelines, you need visibility into the upstream components where a root cause may originate, as well as the downstream datasets and data consumers that may be experiencing knock-on impact.
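For context on where that lineage comes from, here is a minimal Airflow sketch (not Datadog-specific) using Airflow Datasets: an upstream DAG declares the dataset it produces, and a downstream DAG is scheduled on that dataset, making the producer and consumer relationship explicit. The DAG names and the S3 path are illustrative placeholders.

```python
from datetime import datetime

from airflow import DAG
from airflow.datasets import Dataset
from airflow.operators.python import PythonOperator

orders = Dataset("s3://analytics/raw/orders.parquet")  # hypothetical dataset URI

# Upstream DAG: declaring `orders` as an outlet records that this task produces it.
with DAG("extract_orders", start_date=datetime(2024, 1, 1), schedule="@hourly") as producer:
    PythonOperator(
        task_id="extract",
        python_callable=lambda: print("writing orders"),
        outlets=[orders],
    )

# Downstream DAG: scheduling on the dataset makes the consumer relationship explicit,
# which is exactly the lineage you need when tracing the impact of a bad upstream run.
with DAG("build_revenue_report", start_date=datetime(2024, 1, 1), schedule=[orders]) as consumer:
    PythonOperator(
        task_id="aggregate",
        python_callable=lambda: print("building report"),
    )
```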