The latest news and information on monitoring for websites, applications, APIs, infrastructure, and other technologies.

Smarter Alert Management: Test on Historical Data, Review Transitions, and Preview Silencing Schedules

Alert fatigue usually isn’t caused by one thing. It’s the accumulation of thresholds that are slightly too sensitive, alerts that fire during known maintenance windows, and historical patterns that nobody has the tools to review easily. Fixing it requires better visibility into how alerts actually behave over time, and a way to test changes before they hit production. We’ve shipped three improvements to alerting in Netdata that address different parts of this problem.

The Edwin AI Agent Orchestrator: Coordinated Incident Investigation Across the Tools You Already Use

Edwin AI’s Agent Orchestrator keeps incident investigation, context, and response aligned as work moves across tools, eliminating the manual handoffs that slow resolution. Every major incident has two timelines running in parallel. The first is the incident itself—services degrading, users affected, business impact accumulating. The second is quieter and just as costly: engineers switching tabs, re-explaining context to new responders, moving notes from one tool to another by hand.

Building a Unified Enterprise Observability Strategy Webinar

Join Graham Davies, Technical Product Manager at SquaredUp, as he provides a practical guide to breaking down data silos between IT, operations, and the business. In this session, Graham digs into why dashboard and tool sprawl is making decisions harder, not easier, and shows you a practical framework for building a single source of truth your whole organisation can rely on.

Sentry Built AI Dashboards: Monitor Your AI Agents End-to-End

Building AI applications? There's a lot more to monitor beyond errors. With tracing enabled, Sentry's built-in AI Dashboards give you deep visibility into how your agents are actually performing. This video walks through three key dashboard views. You'll also see how to drill from a dashboard widget straight into the trace explorer to pinpoint the root cause of errors, how to duplicate and customize dashboards to fit your needs, and how to set up monitors with alert thresholds, such as getting notified if your LLM calls exceed 20 seconds.

Unified Enterprise Monitoring that Scales

Modernize your monitoring stack with the Progress WhatsUp Gold network monitoring solution in this fast, 30‑minute session. Learn how to replace legacy, multi‑module tools with one unified platform that simplifies operations, boosts visibility, and delivers predictable TCO. Discover how NetOps and ITOps teams can reduce complexity and get actionable insights faster by using WhatsUp Gold to unify network traffic analysis, logs, configuration, and high availability.

Modern IT and the Burden of Accountability

The leaders responsible for modern IT environments rarely talk about features first. They talk about responsibility. In conversations at Nexus Live 2025, ScienceLogic’s annual customer conference, executives and architects across healthcare, federal systems, managed services, telecom, and enterprise IT described modernization not as a tooling upgrade, but as an escalation of accountability.