Keep stakeholders informed with Datadog Status Pages

When incidents occur, clear communication can be just as important as fast remediation. Your internal teams need timely updates to stay aligned, and your users want to know what is happening and when they can expect a fix. Without a reliable way to proactively share updates, support teams can get flooded with tickets and customer trust can erode. Datadog Status Pages, now generally available, makes it easy to keep everyone informed through a public or internal web page during outages.

Scaling Datadog observability: 1,000 integrations and counting

Integrations have always been central to the Datadog platform, enabling customers to collect the data they need directly from the technologies they use every day. By unifying signals from infrastructure and applications to security and SaaS tools, teams gain both high-level visibility and the ability to drill into the details that matter most. With more than 1,000 integrations now available, the Datadog ecosystem continues to expand alongside the platforms our customers rely on.

Monitor Slurm with Datadog

Slurm (Simple Linux Utility for Resource Management) is an open source workload management system used to schedule jobs and manage resources for high-performance computing (HPC) Linux clusters. It schedules jobs and allocates resources fairly and efficiently, and it scales across large clusters, something that native Linux process management tools struggle with.

Ship features faster and safer with Datadog Feature Flags

Releasing new features is one of the highest-stakes moments in the software delivery life cycle. Even with CI/CD pipelines in place, plenty of things can still go wrong when a feature goes live for actual users. Most feature flagging tools operate in isolation from important observability tooling, forcing engineers to monitor changes across multiple disconnected systems to fully understand their impact. This slows down development and increases the chance of missing critical issues.

Model your architecture with custom entities in the Datadog Software Catalog

Every software organization has its own unique architecture and workflows. Beyond services and APIs, teams rely on internal libraries, CI/CD jobs, data pipelines, AI agents, and more to keep systems running smoothly. But as architectures grow more complex and interconnected, it can become difficult to keep track of all the structural dependencies and interactions in one place.

Monitor your data pipelines with Airflow lineage

In complex data pipelines with dozens of jobs and intermediary datasets, it can be difficult to monitor how data travels and changes from step to step. When tracking issues in these pipelines, you need visibility into the upstream components where the root cause may originate, as well as the downstream datasets and data consumers that may be affected.

Proactively monitor Kerberos-authenticated web apps and APIs with Datadog Synthetics

When employee authentication fails or becomes unreliable, users can lose access to the critical systems they need. Authentication enables access to internal tools like HR applications, finance portals, and internal dashboards, so even short outages can interrupt day-to-day work, while persistent issues increase the risk of broader operational disruption.

Track the performance of your HPC workloads with Datadog's AWS PCS integration

AWS Parallel Computing Service (AWS PCS) is a managed service that helps users run and scale their high-performance computing (HPC) workloads. AWS PCS uses Slurm, an open source workload manager, for scheduling and orchestrating simulations, which enables users to build their scientific and engineering models in a familiar HPC environment.

Monitor Windows Certificate Store with Datadog

The Windows Certificate Store is a critical component of any modern Windows environment. Certificates enable TLS encryption for Internet Information Services (IIS)-hosted applications, support certificate-based authentication in Active Directory, and help validate the identity of trusted Windows services. But if a certificate in your store expires, is revoked, or is part of a broken certificate chain, you risk instability and security gaps in your Windows environment.

Visually identify observability gaps with Cloudcraft in Datadog

Modern cloud environments are highly complex and dynamic, with critical services relying on large numbers of ephemeral resources. Ensuring observability coverage across this landscape is essential for troubleshooting, maintaining reliability, optimizing performance, and enforcing security standards. But as environments grow more elaborate and their ownership more dispersed, tracking observability coverage becomes increasingly challenging.