The latest news and information on incident management, on-call, incident response, and related technologies.

AI Is Changing Healthcare Faster Than Most Systems Are Ready For

Feb 17, 2026 By Ritika Bramhe In OnPage

Healthcare is shifting fast, and artificial intelligence is no longer a future concept sitting in research labs or pilot programs. It’s already embedded in clinical workflows, operational systems, and patient interactions, often in ways that feel subtle, uneven, and sometimes uncomfortable.


How to Set Up SMS Alerting w/ OnPage

Feb 17, 2026 By OnPage Corporation In OnPage

In this quick tutorial, learn how to set up SMS alerting in OnPage so your team never misses a critical notification. We’ll walk you through the process step by step. The setup uses redundancy rules for reliable message delivery, so important alerts reach the right person at the right time. Let us know if you have any other questions!


Runbook Automation Release Notes v5.19.0

Feb 17, 2026 By PagerDuty Inc. In PagerDuty

Join us for the latest features in Runbook Automation and Rundeck!


Why SIGNL4 Is the Right Alarm Management Software to Maximize Machine Availability

Feb 16, 2026 By SIGNL4 In SIGNL4

A plant runs at its best when equipment stays online, processes remain stable, tolerances are met, raw materials are delivered in time, and scrap stays low. That’s how operations teams hit production targets, meet customer SLAs, stay on schedule, keep costs under control, and maintain consistent quality. But does everything always run according to plan? Of course not.


Code Is Cheap, Reliability Isn't: Owning Production in the AI era w/ Swizec Teller

Feb 16, 2026 By Rootly In Rootly

In this episode, Swizec Teller, author of the bestselling Scaling Fast, makes a bold claim: code is cheap, reliability is not. As AI coding tools accelerate feature development, the real competitive advantage shifts to operating systems reliably in production. We explore the hidden complexity of SRE work, the addictive nature of agentic coding, and why ownership, not automation, remains at the core of modern software engineering.


Amazon Web Services outage - February 10, 2026

Feb 13, 2026 By Andy Libby In StatusGator

On February 10, 2026, Amazon Web Services (AWS) experienced an outage that triggered widespread reports of CloudFront failures and DNS resolution issues. While AWS later acknowledged the incident, StatusGator detected the disruption earlier using Early Warning Signals, giving customers valuable lead time before the provider confirmed anything publicly.


4 on-call burnout signs (and how to address them)

Feb 13, 2026 By Sreekar In Spike

Being on-call can sometimes feel overwhelming. If that feeling goes unnoticed for too long, it often turns into burnout. Early burnout signs usually show up in how people respond to incidents or how they feel about the schedule. This guide walks through four such signs to watch for before on-call burnout sets in.


Claude outage - February 10, 2026

Feb 12, 2026 By Colin Bartlett In StatusGator

On February 10, 2026, Claude users around the world began reporting service failures affecting chat sessions, API integrations, and Claude Code workflows. The first verified outage report reached StatusGator at 19:33 UTC. StatusGator issued an Early Warning Signal at 20:24 UTC. Claude did not post an official “Investigating” update until 22:11 UTC, 2 hours and 38 minutes after the first report and 1 hour and 47 minutes after the Early Warning Signal. This incident clearly demonstrates the gap between real user impact and official status page updates.


Follow-the-sun and other on-call models

Feb 12, 2026 By Sreekar In Spike

Most teams run on-call using rotation-based schedules where responsibility shifts every few days or weeks. But some situations call for different models that change who responds based on time zones, expertise, or the type of incident that is triggered. This guide walks you through six on-call models that work outside the standard rotation patterns.


Turning Data Into Decisions with the xMatters Incident AI Agent

Feb 12, 2026 By Jon Skog In xMatters

When an incident hits, the gap between awareness and action can make all the difference. Responders know the pain: endless tool-switching, chasing updates, and fragmented data. It’s not a lack of capability that slows response; it’s the lack of context and connection. That’s why we built the xMatters Incident AI Agent, a purpose-built, conversational assistant that brings intelligence and automation directly into the heart of incident response.
