
The latest news and information on monitoring for websites, applications, APIs, infrastructure, and other technologies.

New API: Submit outage reports

We’ve added a new endpoint to the StatusGator API that allows you to submit outage reports for monitors on your board. With the new Outage Reports API, you can programmatically report issues you’re experiencing with a service. These reports help StatusGator detect outages faster and improve visibility for other users who rely on the same services.
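As a rough illustration of what submitting a report programmatically might look like, here is a minimal Python sketch. The announcement does not give the endpoint path, payload fields, or authentication scheme, so the base URL, the path layout, the `description` field, and the bearer-token header below are all placeholders assumed for illustration; the real values should come from the StatusGator API documentation. The function only builds the request and does not send it.

```python
import json
import urllib.request

# Hypothetical placeholder: the real base URL, path, and field names are not
# stated in the announcement and must be taken from the StatusGator API docs.
API_BASE = "https://api.example.invalid/v3"


def build_outage_report(board_id: str, monitor_id: str, token: str,
                        description: str) -> urllib.request.Request:
    """Build (but do not send) a POST request reporting an issue for a monitor."""
    # Assumed path layout: board -> monitor -> outage reports.
    url = f"{API_BASE}/boards/{board_id}/monitors/{monitor_id}/outage_reports"
    # Assumed payload: a single free-text description of the issue.
    body = json.dumps({"description": description}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    req = build_outage_report("board-1", "monitor-9", "TOKEN",
                              "Login page returns 502")
    print(req.get_method(), req.full_url)
    # Sending would be: urllib.request.urlopen(req, timeout=10)
```

Keeping request construction separate from sending makes the sketch easy to adapt once the documented path and fields are known.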

Full-Stack Observability Is Becoming a Business Imperative

As enterprises accelerate digital transformation, technology performance has become inseparable from business performance. Customer experiences, revenue streams, and operational efficiency increasingly depend on the reliability of complex, distributed systems. In this environment, full-stack observability is no longer a technical aspiration — it is a strategic necessity.

Trends in Mainframe Modernization: Fresh Insights from SHARE Orlando

Fresh insights from SHARE Orlando reveal mainframe modernization isn't about replacement—it's evolution. From hybrid architectures to AI-driven automation, enterprises are transforming legacy systems into agile, integrated platforms while preserving core reliability.

Observability for Azure Virtual Desktop with SquaredUp

Managing Azure Virtual Desktop doesn’t have to mean jumping between portal blades, logs, and metrics trying to piece together what’s happening. In this webinar, you’ll learn how to design and implement a single, operational observability dashboard for Azure Virtual Desktop (AVD) using SquaredUp Cloud — transforming fragmented telemetry into clear, actionable insight. Whether you're responsible for performance, user experience, or operational stability, this session will give you a structured, repeatable framework for monitoring your AVD estate with confidence.

Datadog Incident Response: One platform from alert to resolution

When incidents strike, speed and clarity are critical. Datadog Incident Response brings the full incident lifecycle into one platform so teams can move from detection to resolution with confidence. Operate from a single, unified view of your systems, coordinate across the tools your teams already use, and leverage AI that analyzes incidents in real time to surface context, guide decisions, and accelerate resolution.

What is Agentic Observability?

Agentic observability is the instrumentation and correlation needed to explain and control agent behavior across multi-step workflows. Legacy observability focuses on runtime health and service behavior. You monitor metrics like CPU usage, memory, latency, and error rates to confirm that applications and infrastructure are functioning as expected. When a workflow degrades, the proximate cause is often a crash, timeout, permission error, or resource constraint.

How Autonomous Are Your IT Operations, Really?

This post introduces a six-level maturity model that defines what true autonomy looks like in IT operations, from basic AI chat interfaces to fully coordinated agent ecosystems. ITOps teams have more automation tooling than ever, and yet incident response still depends heavily on human judgment to hold it together. Alerts fire, engineers dig through dashboards, context gets assembled by hand, and someone at the end of the workflow makes the final call.