The latest news and information on monitoring for websites, applications, APIs, infrastructure, and other technologies.

Show me the (meeting) money: How to monitor the real-time costs of a meeting in Grafana

This meeting could’ve been an email. It’s a phrase most of us have said (or at least thought) at some point in our careers. For me, that realization hit years ago while working for a government organization. I’d frequently sit through long, agendaless meetings that seemingly went nowhere. I wasn’t sure why I was there. And because I’m an engineer at heart, I started to wonder: what were these meetings actually costing the organization?

The Hidden Barrier to Network Automation Isn't Your AI - It's Your Data

For years, the promise of AI-driven network automation has loomed large. Vendors and analysts alike have painted a future where autonomous operations handle outages before they happen, root causes are explained instantly, and teams finally escape the endless cycle of alerts, tickets, and manual troubleshooting. But in practice, most automation initiatives stall long before they reach that vision.

Ingest OTLP metrics directly into Datadog with the new OTLP Metrics API

Many organizations rely on OpenTelemetry (OTel) to standardize observability across distributed systems. These organizations are at varying stages of adoption and are implementing OTel in complex environments with diverse configurations. To support this range of use cases, Datadog offers several ways to send OpenTelemetry data to its platform.

A deep dive into Java garbage collectors

Historically, developers have relied on languages like C and C++ for explicit control over memory allocation and deallocation. This approach can yield very low overhead and tight control over performance, but it also increases complexity and risk (e.g., memory leaks, dangling pointers, and double frees). This often results in runtime issues that are difficult to diagnose, which can become a drag on team velocity.

Track, debug, and roll back changes with Version History for Synthetic Monitoring tests

A synthetic test is only useful if you can trust what it’s telling you. When one fails, the reason may not be obvious. Was the application updated? Did the test change? Or both? As more people contribute and refine the same test, it becomes harder to understand what changed or restore a working version. Without clear visibility into those updates, teams can spend more time tracking down the cause of a failure than resolving it.

Onboarding Microsoft Sentinel data lake with DataStream

Modern security operations teams face an overwhelming challenge: a rapidly growing volume of logs, alerts, and telemetry from cloud services, on-premises infrastructure, and third-party security tools. Traditional SIEM platforms often struggle to scale cost-effectively and provide the agility needed for advanced analytics and threat hunting.

SharePoint Server Monitoring: Uptime, Performance & SLAs

SharePoint is the backbone of internal collaboration for countless organizations. It hosts documents, drives workflows, powers intranets, and underpins team communication across departments. But when it slows down—or worse, goes dark—productivity grinds to a halt. The problem is that most monitoring approaches treat SharePoint like a static website. They check availability, not experience.

Powering Mexico's Digital Future: Expanded Internet Observability with Catchpoint

As of 2025, more than 110 million Mexicans are online, putting digital-access penetration at roughly 83% of the population. Mexico is already one of Latin America’s anchor markets, leading the region in startup momentum, cloud adoption, and cross-border digital trade. A few days ago, CloudHQ announced a $4.6B investment in Mexico to open multiple datacenters. Yet even with this scale, service quality still varies dramatically across cities, states, and ISPs.

How Leading Businesses Achieved Greater Uptime with Atatus Monitoring

When every second of downtime can mean lost revenue and frustrated customers, leading businesses can’t afford to leave performance to chance. That’s why many are turning to Application Performance Monitoring (APM) tools like Atatus, a Datadog alternative, to keep their applications healthy, detect issues before customers do, and achieve higher uptime than ever. But how exactly are they doing it?