Best practices for end-to-end custom metrics governance

Custom metrics enable you to track what matters most to your specific business and services, and to correlate it with the rest of your telemetry data. As your organization grows by adding more teams, services, and environments, your volume of custom metrics can grow with it. To ensure critical visibility while maintaining cost efficiency, organizations need an end-to-end approach to custom metrics governance.
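To make this concrete, here is a minimal sketch of emitting a custom metric with the datadogpy DogStatsD client. The metric names and tags (`checkout.orders.placed`, `team:payments`, and so on) are hypothetical placeholders for whatever business signal you choose to track.

```python
# A minimal sketch of submitting custom metrics via DogStatsD using the datadogpy client.
# The metric names and tag values below are hypothetical examples, not prescribed names.
from datadog import initialize, statsd

# Point the client at the local Datadog Agent's DogStatsD port.
initialize(statsd_host="127.0.0.1", statsd_port=8125)

# Count a business event, tagged so it can later be sliced by team, service, and environment.
statsd.increment(
    "checkout.orders.placed",
    tags=["team:payments", "service:checkout", "env:prod"],
)

# Record a business-level gauge alongside it.
statsd.gauge("checkout.cart.value", 42.5, tags=["service:checkout", "env:prod"])
```

Tagging each submission consistently is what later lets governance tooling attribute custom metric volume to the teams and services that emit it.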

Introducing RUM without Limits: Capture everything, keep what matters

Real User Monitoring (RUM) helps teams understand exactly how their users experience their web and mobile applications—from load times to crashes and frustration signals. But traditional RUM models come with tough trade-offs: capture all sessions and overspend, or sample data and miss what matters. Fixed sampling rates may help manage volume, but they leave dangerous blind spots.

Highlights from Google Cloud Next 2025

Google Cloud Next is the biggest event of the year for the Google Cloud community, showcasing the latest and greatest offerings from Google Cloud and hundreds of its partners. As a long-time Google Cloud partner and recipient of three Google Cloud Partner of the Year awards in 2025, Datadog was there in full force, delivering several speaking sessions and running a booth on the expo floor where we met with thousands of attendees. In case you missed it, don’t worry.

Build Vega-Lite visualizations natively in Datadog with the Wildcard widget

Datadog dashboards provide a unified view of your applications, infrastructure, logs, and other observability data—making it easy to monitor health, investigate issues, and share insights across teams. While native Datadog widgets support a broad range of visualization types, some use cases call for more customized representations, particularly when you’re working with unconventional data formats, external sources, or specific transformations.
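For readers unfamiliar with Vega-Lite, the sketch below shows what a minimal specification looks like, expressed here as a Python dict and serialized to JSON. The inline data values are made up for illustration, and wiring the spec into the Wildcard widget itself is not shown.

```python
# A minimal, self-contained Vega-Lite v5 specification built as a Python dict and
# serialized to JSON. The inline data values are illustrative only.
import json

spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "description": "Request counts by region (illustrative data).",
    "data": {
        "values": [
            {"region": "us-east-1", "requests": 1200},
            {"region": "eu-west-1", "requests": 830},
            {"region": "ap-south-1", "requests": 410},
        ]
    },
    "mark": "bar",
    "encoding": {
        "x": {"field": "region", "type": "nominal"},
        "y": {"field": "requests", "type": "quantitative"},
    },
}

print(json.dumps(spec, indent=2))
```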

Detect hallucinations in your RAG LLM applications with Datadog LLM Observability

Hallucinations occur when a large language model (LLM) confidently generates information that is false or unsupported. These responses can spread misinformation that jeopardizes safety, causes reputational damage, and erodes user trust. Augmented generation techniques, such as retrieval-augmented generation (RAG), aim to reduce hallucinations by providing LLMs with relevant context from verified sources and prompting the LLMs to cite these sources in their responses.
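The sketch below illustrates the RAG prompting pattern described above: retrieved passages from verified sources are injected into the prompt, and the model is asked to ground its answer in them and cite its sources. The document store and `retrieve()` helper are hypothetical stand-ins, not part of Datadog LLM Observability.

```python
# A schematic sketch of retrieval-augmented generation (RAG) prompting: retrieved
# passages are placed in the prompt and the model is asked to cite them. The toy
# document store and retriever are hypothetical stand-ins for a real vector search.
from typing import List

VERIFIED_DOCS = {
    "doc-1": "Refunds are processed within 5 business days of approval.",
    "doc-2": "Orders can be cancelled free of charge within 24 hours.",
}

def retrieve(question: str, k: int = 2) -> List[str]:
    """Toy retriever: return the top-k document IDs (a real system would rank by relevance)."""
    return list(VERIFIED_DOCS)[:k]

def build_rag_prompt(question: str) -> str:
    doc_ids = retrieve(question)
    context = "\n".join(f"[{doc_id}] {VERIFIED_DOCS[doc_id]}" for doc_id in doc_ids)
    return (
        "Answer the question using only the context below, and cite the bracketed "
        "source IDs you relied on. If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

print(build_rag_prompt("How long do refunds take?"))
```

Because the answer is expected to stay within the supplied context and cite it, responses that drift from those sources become easier to flag as potential hallucinations.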

Discover powerful insights with nested metric queries

To gain adequate visibility into your distributed applications, you need to observe those applications at different levels of granularity. This means that you need to be able to query collected telemetry data both at the level of the whole application and at the level of selected components. Thanks to the power of Datadog tagging, you can already do this by aggregating your metrics within any scope of your choosing.
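As a rough sketch of what that looks like in practice, the queries below fetch the same metric at two levels of granularity using the datadogpy client: once scoped to the whole service, and once broken out per component via tags. The metric name, service, and tag key (`resource_name`) are hypothetical; substitute whatever scopes your own tagging supports.

```python
# A minimal sketch of querying one metric at two granularities with datadogpy.
# Metric, service, and tag names are hypothetical examples.
import time
from datadog import initialize, api

initialize(api_key="<DD_API_KEY>", app_key="<DD_APP_KEY>")

now = int(time.time())
one_hour_ago = now - 3600

# Whole-application view: average request latency across the entire service.
app_level = api.Metric.query(
    start=one_hour_ago,
    end=now,
    query="avg:trace.http.request.duration{service:shopist}",
)

# Component-level view: the same metric, broken out per endpoint by tag.
per_endpoint = api.Metric.query(
    start=one_hour_ago,
    end=now,
    query="avg:trace.http.request.duration{service:shopist} by {resource_name}",
)
```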

Understand and manage your Datadog spend with Datadog cost data in Cloud Cost Management

As your organization scales its Datadog footprint, you want to understand what’s driving cost changes and promote cost awareness. But to take meaningful action, you need more than a monthly bill—you need real-time, contextualized cost data tied to services and teams. Without this visibility, it’s hard to assign ownership, prevent cost overruns, or identify which changes are affecting spend.

How we use RUM to make design decisions that enhance user experience

Before we started using Datadog Real User Monitoring (RUM), we relied on frontend logging to gather data about the user experience. Logs gave us some helpful information about exceptions and errors but didn't provide insight into issues as experienced from the user's perspective.

Monitoring AI proxies to optimize performance and costs

Businesses deploying LLM workloads increasingly rely on LLM proxies (also known as LLM gateways) to simplify model integration and governance. Proxies provide a centralized interface across LLM providers, govern model access and usage, and apply compliance safeguards for smoother operations and reduced complexity—making LLM usage more consistent and scalable.
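One common pattern is for the proxy to expose an OpenAI-compatible endpoint, so applications only need to change the base URL (plus any gateway-specific headers) rather than their integration code. The sketch below assumes such a setup; the proxy URL, header name, and model alias are assumptions, not any specific product's API.

```python
# A minimal sketch of routing requests through an LLM proxy/gateway that exposes an
# OpenAI-compatible endpoint. The proxy URL, header, and model alias are hypothetical.
from openai import OpenAI

client = OpenAI(
    base_url="https://llm-gateway.internal.example/v1",  # hypothetical proxy endpoint
    api_key="<GATEWAY_TOKEN>",                           # credential issued by the gateway
    default_headers={"x-team": "checkout"},              # hypothetical header for attribution/governance
)

response = client.chat.completions.create(
    model="default-chat",  # an alias the gateway maps to an underlying provider's model
    messages=[{"role": "user", "content": "Summarize today's failed payment spikes."}],
)
print(response.choices[0].message.content)
```

Centralizing traffic this way also gives the proxy a single vantage point for the latency, token usage, and cost signals that monitoring then builds on.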