
Datadog on Cloud Workload Identities

Datadog operates dozens of Kubernetes clusters, tens of thousands of hosts, and millions of containers across a multi-cloud environment, spanning AWS, Azure, and Google Cloud. With over 2,000 engineers, we needed to ensure that every developer and application could securely and efficiently access resources across these various cloud providers.

Detect and troubleshoot Windows Blue Screen errors with Datadog

Windows Blue Screen errors—also known as bug checks, STOP codes, kernel errors, or the Blue Screen of Death (BSOD)—are triggered when the operating system detects a critical issue that compromises system stability. To prevent further damage or data corruption, the OS determines that the safest course of action is to shut down immediately. The system then restarts and displays the well-known BSOD.

Integrate usage data into your product analytics strategy

Web applications emit a wealth of metadata and user interaction information that’s critical to understanding user behavior. However, parsing this data to find what is most relevant to your product analytics project can be challenging—what one product analyst might find useful, another might consider unnecessary noise.

Get complete Kubernetes observability by monitoring your CRDs with Datadog Container Monitoring

Custom resources are critical components in Kubernetes production environments. They enable users to tailor Kubernetes resources to their specific applications or infrastructure needs, automate processes through operators, simplify the management of complex applications, and integrate with non-native applications such as Kafka and Elasticsearch.
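As a brief illustration of the idea, a CustomResourceDefinition manifest registers a new resource type with the Kubernetes API server, which then serves it alongside built-in resources. This sketch uses the hypothetical `CronTab` type from the upstream Kubernetes documentation; the group and field names are placeholders, not part of any Datadog product:

```yaml
# Minimal hypothetical CRD: registers a namespaced "CronTab" resource
# under the example.com API group, with a small validated schema.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: crontabs.example.com
spec:
  group: example.com
  names:
    kind: CronTab
    plural: crontabs
    singular: crontab
  scope: Namespaced
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                cronSpec:
                  type: string
                replicas:
                  type: integer
```

Once applied, instances of the new type can be created and listed like any built-in resource (e.g., `kubectl get crontabs`), which is what makes CRDs such a natural extension point for operators.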

A guide to scaling out your Kubernetes pods with the Watermark Pod Autoscaler

While overprovisioning Kubernetes workloads can provide stability during the launch of new products, it’s often only sustainable because large companies have substantial budgets and favorable deals with cloud providers. As highlighted in Datadog’s State of Cloud Costs report, cloud spending continues to grow, but a significant portion of that cost is often due to inefficiencies like overprovisioning.

Kubernetes autoscaling guide: determine which solution is right for your use case

Kubernetes offers the ability to scale infrastructure to accommodate fluctuating demand, enabling organizations to maintain availability and high performance during surges in traffic and reduce costs during lulls. But scaling comes with tradeoffs and must be done carefully to ensure teams are not overprovisioning their workloads or clusters. For example, organizations often struggle with overprovisioning in Kubernetes and wind up paying for resources that go unused.

Monitor Azure AI Search with Datadog

Azure AI Search is Microsoft Azure’s managed search service. In addition to tackling traditional search use cases, Azure AI Search also includes AI-powered features to make it a fully capable document catalog, search engine, and vector database. AI Search is highly interoperable—it can use models created in Azure OpenAI Service, Azure AI Studio, or Azure ML.

Use Datadog App Builder to peek, purge, or redrive AWS SQS queues

This video showcases how developers can self-serve from an application to simplify the management of their AWS cloud resources. Rather than switching between tools or reaching out to another team for help, developers can take action directly from their observability tool, enabling faster resolution of application issues. We demonstrate how to build a simple app that minimizes disruptions by letting developers act quickly on their SQS queues in AWS, using insights provided by Datadog.

Troubleshoot and resolve Kubernetes issues with AI-powered guided remediation

As teams adopt Kubernetes at greater scale, they face increased complexity in keeping their growing list of workloads and services up and running. Achieving the visibility and context needed to detect and resolve incidents quickly is difficult amid a constant flood of telemetry data and alerts. Furthermore, Kubernetes expertise often remains siloed in DevOps and infrastructure teams.