Operations | Monitoring | ITSM | DevOps | Cloud

Latest Posts

Datadog named a Leader in the first-ever 2024 Gartner Magic Quadrant for Digital Experience Monitoring

We are thrilled to announce that Datadog has been named a Leader in the first-ever 2024 Gartner Magic Quadrant for Digital Experience Monitoring. Datadog was positioned highest for Ability to Execute. We believe this placement reflects our commitment to being an end-to-end observability platform that brings together all signals from across your tech stack into a unified ecosystem.

Trace your applications end to end with Datadog and OpenTelemetry

As teams adopt OpenTelemetry (OTel) to instrument their systems in a vendor-neutral way, they often face a challenge in effectively tracing activity throughout their entire stack, from frontend user interactions to backend services and databases. While OTel enables basic tracing, teams still need a way to access advanced capabilities like continuous profiling to adequately optimize performance and troubleshoot issues in their applications.

Flaky tests: their hidden costs and how to address flaky behavior

Flaky tests are bad—this is a fact implicitly understood by developers, platform and DevOps engineers, and SREs alike. When tests flake (i.e., generate conflicting results across test runs, without any changes to the code or test), they can arbitrarily fail builds, requiring developers to re-run the test or the full pipeline. This process can take hours—especially for large or monolithic repositories—and slow down the software delivery cycle.
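
As a minimal sketch of the definition above (conflicting results across runs with no change to the code or test), the following flags any test that both passed and failed on the same commit. The function name, the `(test_name, commit_sha, passed)` record shape, and the sample data are assumptions made for illustration; they are not taken from any Datadog product or API.

```python
from collections import defaultdict


def find_flaky_tests(runs):
    """Return sorted names of tests that both passed and failed on one commit.

    `runs` is an iterable of (test_name, commit_sha, passed) tuples.
    This record shape is hypothetical, chosen only for the example.
    """
    outcomes = defaultdict(set)
    for test_name, commit_sha, passed in runs:
        # Collect every distinct outcome seen for this test on this commit.
        outcomes[(test_name, commit_sha)].add(passed)
    # Two distinct outcomes on the same commit means the test flaked.
    return sorted({name for (name, _), seen in outcomes.items() if len(seen) == 2})


# Illustrative data only.
runs = [
    ("test_login", "abc123", True),
    ("test_login", "abc123", False),     # same commit, different result
    ("test_checkout", "abc123", True),
    ("test_checkout", "def456", False),  # different commit: may be a real regression
]
flaky = find_flaky_tests(runs)
```

Grouping by commit is the key design choice here: a test that fails only after the code changes is not counted as flaky, because the change itself may explain the failure.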

Monitor your Azure OpenAI applications with Datadog LLM Observability

Azure OpenAI Service is Microsoft’s fully managed platform for deploying generative AI services powered by OpenAI. Azure OpenAI Service provides access to models including GPT-4o, GPT-4o mini, GPT-4 Turbo with Vision, DALL-E 3, and the Embeddings model series, alongside the enterprise security, governance, and infrastructure capabilities of Azure.

Generate metrics from your high-volume logs with Datadog Observability Pipelines

Logs are a rich source of information, providing you with the minute details you need to troubleshoot a specific issue or perform extensive historical analysis. But with billions of logs being generated from your infrastructure every day, it isn’t practical to sift through them all to derive actionable insights. Firewall, CDN, network activity, and load balancer logs are especially high volume, requiring storage solutions that can be expensive and difficult to scale.
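
The general idea of deriving a metric from logs can be sketched as a simple aggregation: instead of storing every line, keep a count per value of a field you care about. The example below is not the Observability Pipelines configuration or API; the `key=value` log format, the `status` field, and the function name are all assumptions made for illustration.

```python
from collections import Counter


def count_by_status(log_lines):
    """Count log lines per `status` value.

    Assumes a hypothetical space-separated key=value log format,
    e.g. "src=10.0.0.1 status=200 bytes=512".
    """
    counts = Counter()
    for line in log_lines:
        # Parse only well-formed key=value pairs; ignore everything else.
        fields = dict(pair.split("=", 1) for pair in line.split() if "=" in pair)
        status = fields.get("status")
        if status is not None:
            counts[status] += 1
    return counts


# Illustrative load balancer-style lines, not real data.
logs = [
    "src=10.0.0.1 status=200 bytes=512",
    "src=10.0.0.2 status=502 bytes=0",
    "src=10.0.0.3 status=200 bytes=2048",
    "malformed line without fields",
]
status_counts = count_by_status(logs)
```

The resulting counts are far smaller than the raw logs, so they can be kept as a time series while the lines themselves are sampled, archived, or dropped.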

Reduce your AWS Step Functions' error remediation time by redriving executions directly from Datadog

AWS enables customers to retry or redrive Step Functions executions to continue any failed executions of Standard Workflows from their points of failure while maintaining all inputs. For example, if you find broken downstream logic in your code or experience unexpected errors upon execution, you can remediate those errors by fully re-running an execution or by using redrive to continue it from where it failed.

Gain visibility into your Camunda 8 components with Bordant Technologies' Datadog integration

Camunda 8 is a process orchestration platform that automates and executes business processes at scale. Many organizations orchestrate their business processes using Camunda 8 Self-Managed because it can operate in their preferred public cloud provider, such as AWS, or in a private cloud, like a Kubernetes cluster. However, hosting Camunda 8 while maintaining its health and performance requires complete visibility into your environment so that you can properly allocate resources and minimize downtime.

How to spot and fix memory leaks in Go

A memory leak is a faulty condition where a program fails to free up memory it no longer needs. If left unaddressed, memory leaks result in ever-increasing memory usage, which in turn can lead to degraded performance, system instability, and application crashes. Most modern programming languages include a built-in mechanism to protect against this problem, with garbage collection being the most common. Go has a garbage collector (GC) that does a very good job of managing memory.

How we used Datadog to save $17.5 million annually

Like most organizations, we are always trying to be as efficient as possible in our usage of our cloud resources. To help accomplish this, we encourage individual engineering teams at Datadog to look for opportunities to optimize. They can share their performance wins, big or small, in an internal Slack channel along with visualizations and, often, calculations of the resulting annual cost savings.

Optimize your AWS costs with Cloud Cost Recommendations

Managing your AWS costs is both crucial and complex, and as your AWS environment grows, it becomes harder to know where you can optimize and how to execute the necessary changes. Datadog Cloud Cost Management provides invaluable visibility into your cloud spend that enables you to explore costs and investigate trends that impact your cloud bill.