Operations | Monitoring | ITSM | DevOps | Cloud

Latest Posts

Monitor Amazon MemoryDB with Datadog

Amazon MemoryDB for Redis is a highly durable in-memory database service that uses cross-availability-zone data storage and fast failover, providing microsecond read times and single-digit-millisecond write times. Datadog’s integration for MemoryDB uses a range of metrics to provide important visibility into MemoryDB performance.

Quickly and comprehensively analyze the cloud and SaaS costs behind your services

Understanding costs is an essential part of service ownership. But in cloud-based applications, the cost of any given service often comes down to a wide range of dynamic factors. Individual services can incur fees from numerous dependencies, from data stores to observability solutions, and keeping track of these expenses can mean reckoning with the intricacies of many different billing models.

Transform and enrich your logs with Datadog Observability Pipelines

Today’s distributed IT infrastructure consists of many services, systems, and applications, each generating logs in different formats. These logs contain layers of important information used for data analytics, security monitoring, and application debugging. However, extracting valuable insights from raw logs is complex, requiring teams to first transform the logs into a well-known format for easier search and analysis.

Why care about exception profiling in PHP?

A few months ago, we implemented support for exception profiling in PHP. One of the key justifications for building this functionality into Continuous Profiler was to show the hidden costs of exceptions in PHP, especially when they are used for flow control in hot code paths. Once this feature was built, we naturally wanted to know if it surfaced these kinds of flow control problems in customer production systems.

Get granular LLM observability by instrumenting your LLM chains

The proliferation of managed LLM services like OpenAI, Amazon Bedrock, and Anthropic has introduced a wealth of possibilities for generative AI applications. Application engineers are increasingly creating chain-based architectures and using prompt engineering techniques to build LLM applications for their specific use cases.

Integration roundup: Monitoring the health and performance of your container-native CI/CD pipelines

Widespread adoption of containerized infrastructure has been closely followed by an explosion of container-native tools for each layer of the stack, including new solutions for managing CI/CD pipelines in container-based environments, such as the Argo suite, FluxCD, and Tekton. This is because these lightweight solutions make it easier to automate builds, testing, deployments, and more on Kubernetes, as well as other platforms that manage containerized workloads and services.

Reduce alert storms in your microservices architecture with easily scalable techniques

Alert storms occur when your monitoring platform generates excessive alerts simultaneously or in succession. Although numerous factors can cause an alert storm, microservices architectures are uniquely susceptible to them due to multiple service dependencies, potential failure points, and upstream and downstream service relationships.

Introducing Toto: A state-of-the-art time series foundation model by Datadog

Foundation models, or large AI models, are the driving force behind the advancement of generative AI applications that cover an ever-growing list of use cases, including chatbots, code completion, autonomous agents, image generation, and more. However, when it comes to understanding observability metrics, current large language models (LLMs) are not optimal.

Recapping DASH 2024

DASH 2024 was our biggest event yet! Over two days, thousands from the Datadog community gathered at North Javits in New York City for an impactful experience. The 2024 keynote featured numerous new product launches and updates, but there was much more to enjoy beyond this speech. Attendees got to experience breakout sessions, workshops, certification exams, one-on-one Datadog consultations, and a bustling expo hall.

Optimize PostgreSQL performance with Datadog Database Monitoring

PostgreSQL is a widely used open source relational database that many organizations operate as a core part of their infrastructure stack. Because databases are mission-critical, database-related issues can have outsize downstream impacts on user experience, service performance, and data retention, making it vital to identify and address problems quickly.