Building a DORA metrics Scorecard

There are a lot of ways to gauge the performance of your DevOps teams and the health of your software, but DORA metrics have emerged as the industry standard. If you aren’t familiar with DORA metrics, take a few minutes to read this comprehensive guide to understanding DORA metrics. DORA metrics were designed to offer a high-level, long-term view of how your teams are performing.

This Month in Datadog - August 2025

In the August episode of This Month in Datadog, Jeremy shares how you can make more informed cloud cost decisions, gain insights into your LiteLLM-powered applications, and secure Kubernetes infrastructure with Datadog Workload Protection. Later in the episode, Danny puts the spotlight on Datadog Kubernetes Autoscaling, which helps you deliver cost savings without sacrificing performance.

AWS metric ingestion for less: Save money and get near real-time stream into Grafana Cloud

There’s a new way to ingest AWS metrics into Grafana Cloud that makes observing your AWS resources more cost-effective, easier to operate, and more accurate. You can now stream metrics into the AWS Observability app in Grafana Cloud in near real-time thanks to our new integration with Amazon CloudWatch and Amazon Data Firehose. We’re already using it internally, and we’re finding that it’s not only easier to operate—it’s at least five times more cost-effective.

Visualize Logs Alongside Metrics: Complete Observability for Slow MongoDB Operations

MongoDB’s strengths, a flexible schema and fast iteration, can also hide costly queries until they surface as user-facing latency, replica lag, or spiky CPU. A handful of slow operations can put pressure on the cache, starve other workloads, and cascade into timeouts across services. Monitoring slow queries gives you an early warning system for index gaps and query-plan regressions introduced by code deploys, schema changes, or shifting data shapes.

The hidden costs of shadow AI: CPU drain, data risk, and network bottlenecks

The risk of headline-grabbing incidents caused by AI usage outside the authorization and control of IT (a.k.a. shadow AI), such as Samsung’s ChatGPT data leak, is clear. Most IT teams recognize that a high-profile incident can have serious repercussions. However, the risk of shadow AI goes well beyond a single incident. In fact, the recent Komprise IT Survey indicates that 79% of organizations have experienced negative outcomes from sending corporate data to AI.

10 Ways to Optimize Data Center Operations

Running a data center efficiently is no small feat. From managing energy costs to preventing downtime, there's a lot that can go wrong—and a lot that can be optimized. Discover 10 actionable ways to enhance your data center operations, with practical tips on how Hyperview DCIM software can help you achieve these improvements more easily and effectively.

Weaving AppNeta Experience Insights into DX NetOps: A Step-by-Step Guide

Today’s enterprise networks aren’t constrained to a single location—they span continents, clouds, and providers, and they’re relied upon by users who can work from anywhere. For network operations teams, that means every issue is a potential scavenger hunt. Is it the app? The WAN? The cloud provider? The ISP? The stakes are high and your tools need to evolve. That’s why the integration of DX NetOps and AppNeta is such a game-changer.

Why database monitoring is critical for application performance

When an application slows down, users rarely think about the database, but in many cases, that’s where the bottleneck lies. Databases sit at the core of nearly every application, storing, retrieving, and processing the information that powers business transactions, analytics, and user interactions. A minor inefficiency in query execution or a spike in resource usage can cascade into multiple issues, from degraded application performance to service interruptions or even downtime.

Bringing Observability to Claude Code: OpenTelemetry in Action

AI coding assistants like Claude Code are becoming core parts of modern development workflows. But as with any powerful tool, the question quickly arises: how do we measure and monitor their usage? Without proper visibility, it’s hard to understand adoption, performance, and the real value Claude Code brings to engineering teams. For leaders and platform engineers, that lack of observability can mean flying blind when it comes to understanding ROI, productivity gains, or system reliability.

Top 3 Jira reporting tools: SquaredUp vs Power BI vs Jira

A recent survey revealed that developers and engineering teams waste 8+ hours a week on inefficiencies in their role. Poor reporting tools are a major contributor, with Jira regarded as a frequent source of friction. But since Jira is so deeply embedded in most organizations' infrastructure and processes, replacing it is not really an option. Instead, the solution lies in optimizing how users interact with it.