The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Uptrace Errors & Logs Tutorial: Capture Stacktraces, Context, and Traces in One Place

Every error tells a story, and Uptrace helps you see the full picture. In this tutorial, you’ll learn how to use Uptrace to capture errors, logs, stacktraces, and request context in a single observability platform. See how errors automatically link to traces, understand exactly what happened, and debug issues faster with rich attributes, user data, and performance impact. You’ll come away understanding not just *what broke*, but *who it affected and why*, and able to fix problems with confidence using Uptrace.

AWS CloudFront Outage (Feb 2026): Timeline, Cascade, and Lessons

At approximately 9:15 PM UTC on February 10, 2026, Amazon CloudFront began returning NXDOMAIN responses for DNS queries against specific distributions. In practical terms: DNS was telling users that services behind those distributions simply didn't exist. The root cause was a DNS resolution failure within CloudFront's infrastructure that quickly spread to eight interconnected AWS services.
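
To make the NXDOMAIN symptom concrete, here is a minimal sketch of how a client-side check might tell "this name does not exist" apart from other lookup failures. It uses only Python's standard `socket` module; the function name `classify_resolution` and the injectable `resolver` parameter are hypothetical, introduced for illustration, and nothing here reflects AWS tooling. Note that the operating system's resolver reports NXDOMAIN as `EAI_NONAME`, and on some platforms that same code can also cover other "no usable address" cases, so treat the result as a hint rather than a diagnosis.

```python
import socket


def classify_resolution(hostname, resolver=socket.getaddrinfo):
    """Classify a lookup as 'resolved', 'nxdomain', or 'error'.

    `resolver` is injectable (an assumption for this sketch) so the
    logic can be exercised without touching the network.
    """
    try:
        resolver(hostname, 443)
    except socket.gaierror as exc:
        # EAI_NONAME is how the OS resolver surfaces "name not found",
        # which is what a client sees when DNS answers NXDOMAIN.
        if exc.errno == socket.EAI_NONAME:
            return "nxdomain"
        # Timeouts, server failures, and similar problems land here.
        return "error"
    return "resolved"
```

Calling `classify_resolution("d111111abcdef8.cloudfront.net")` (a placeholder hostname) from a monitoring script would perform a real lookup; a result of `"nxdomain"` corresponds to the failure mode described above, where healthy services appear not to exist.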

Sponsored Post

From cloud costs to cloud value: The role of performance analytics in increasing ROI

Many cloud providers offer services that scale with usage. However, unanticipated overutilization of compute instances, serverless functions, or managed databases can quickly drive up costs. Managing these resources effectively is crucial for keeping cloud spending predictable.

AI Query Assist for SolarWinds Database Performance Analyzer

Is your database slow? Let AI do the heavy lifting. Watch how SolarWinds DPA’s AI Query Assist transforms query tuning from a manual headache into a streamlined process. This demo shows you how to get instant, AI-powered recommendations for your worst-performing queries while maintaining the control to review and verify every fix. It’s not just about finding the problem—it’s about fixing it faster.

VictoriaMetrics at FOSDEM, Cloud Native Days France, and CfgMgmtCamp Ghent

Last week, members of the VictoriaMetrics team, including myself, spoke at three very different but equally important community events: FOSDEM in Brussels, Cloud Native Days France in Paris, and CfgMgmtCamp in Ghent. Each event drew a different crowd with its own expectations, making them a good way to see where open source observability stands today and how VictoriaMetrics is adapting to real-world needs. The talks we gave were snapshots of the problems we are actively working on.

How to run checks on internal services with Grafana Cloud Synthetic Monitoring

Many critical services run inside private networks, where traditional monitoring tools and practices can’t offer full visibility. This makes it difficult to validate service availability and performance before problems impact your users. Synthetic Monitoring — a Grafana Cloud solution that helps you proactively monitor the performance of your applications and services — addresses this gap with a feature known as private probes.

What is DEX Ops?

For decades, IT operations have been built around incidents, SLAs, and ticket closure rates. Success has been defined by how quickly tickets are resolved and whether service levels are met. But the modern digital workplace has changed. Employee productivity, digital adoption, collaboration quality, and business performance depend on far more than ticket metrics. A device that “works” but performs poorly still erodes productivity.

How to Create and Manage Incidents in Uptime.com

Learn how to create and manage incidents on your Uptime.com Status Page to keep your subscribers informed about service disruptions and maintenance events in real-time. In this tutorial, we'll cover understanding incident statuses (Investigating, Identified, Monitoring, Resolved, and more), three ways to create a new incident, configuring incident details and timelines, adding updates with Markdown formatting, managing and editing incidents, notifying Status Page subscribers, and using the REST API for incident management.

What Agentic AI Is Really Made Of (Most People Miss This)

Agentic AI isn’t just an LLM. Without the right context, it gives generic answers. This is the component that makes its decisions actually useful. About Elastic: Elastic, the Search AI Company, enables everyone to find the answers they need in real time, using all their data, at scale. Elastic’s solutions for search, observability, and security are built on the Elastic Search AI Platform, the development platform used by thousands of companies, including more than 50% of the Fortune 500.