The latest news and information on incident management, on-call, incident response, and related technologies.

Why Faster Recovery Beats Faster Shipping in the AI Era

A year ago, AI coding tools worked alongside developers—suggesting the next line, completing a function, accelerating work that a human was already doing. Today, they’re writing entire modules and services independently, producing code that no human has reviewed line by line, built from components that no single person has fully mapped. And adoption is only accelerating: According to our recent AI Resilience Survey, 84% of organizations are now using AI to write, review, or suggest code.

Why Modern IT Incident Response Needs Social Sentiment Analysis

IT operations teams face an ongoing battle against alert fatigue. Despite running sophisticated telemetry and baseline Application Performance Monitoring, engineers are often bombarded with notifications that lead nowhere. Relying purely on internal dashboards creates a massive visibility gap, and when critical incidents slip through the cracks, the financial damage is swift and severe. To close this gap, DevOps professionals are increasingly looking beyond traditional server metrics and turning to a surprising source for early warning signals: public social sentiment.

PagerDuty agent app in GitHub

PagerDuty's agent app shows live incident state, incident history, and change correlations inside GitHub, so you can get context right within your PR without interrupting your flow. Automatically correlate incident data with recent commits and deployments to identify root causes, then generate fix PRs with proper incident linking.

PagerDuty agent app in GitHub: incident context where you already work

This blog post is part of PagerDuty’s ongoing series on how we’re helping customers navigate their journey toward autonomous operations. Read on to learn about the PagerDuty agent app in GitHub (Early Access) and how it builds toward this vision. How many tabs do you have open right now? And how many more do you open the moment an incident hits? Context switching during incident response is one of the most persistent sources of toil in engineering.

AI Orchestrations: Your easy button for proactive operations

This blog post is part of PagerDuty’s ongoing series on how we’re helping customers navigate their journey towards autonomous operations. Read on to learn about how AI Orchestrations builds towards this vision. “We should automate this.” Sound familiar? For many operations teams, that sentence never becomes action. Building event orchestration rules demands deep platform expertise, time no one has, and the ability to spot which patterns in your data actually matter.

5 Reasons OnPage Tops the Best HIPAA Messaging Apps List

Choosing a HIPAA-compliant messaging app is rarely about security alone. Healthcare teams need messages that get read, on-call schedules that route to the right provider, and reliability that holds up at 3 a.m. Most apps clear the encryption bar. Fewer guarantee that a missed page never happens, or that critical alerts from medical systems and urgent after-hours calls from a discharged patient reach the right on-call staff.

AI Is Not a Switch: The Real Path to AI-First Operations

Organizations are no longer asking whether to adopt AI; that question is settled. The focus now is on reaching a point where AI is doing meaningful operational work—or as the industry calls it, being “AI-first.” But being “AI-first” isn’t binary. You don’t go from zero AI to meaningful autonomy by flipping a switch. In reality, getting there means moving through distinct stages.