Introducing Runner Replicas: Scalable, Reliable Automation for Modern Ops

When you’re responsible for the reliability of complex systems, the execution layer of your automation is not something you want to think about—it should just work. Whether you’re deploying code, patching servers, or responding to an incident at 3 a.m., your automation engine should be as resilient and scalable as the infrastructure it’s operating on.

The Next Wave of Automation Makes More Room for Humans

When a system goes down, the impact isn’t just technical. It’s the people at the center of it who adapt, improvise, apply their judgment, and keep the business moving forward. I’ve worked in operations for more than 25 years, and one thing I’ve learned is that in any system, it’s the humans who are the truly resilient part.

Demo Roundups! Breaking the MTTR Bottleneck: Automating Diagnostics for Modern Incident Response

Discover how PagerDuty Automation eliminates the manual triage bottleneck that's slowing down your incident response. In this demo, you'll see how automating diagnostics can compress resolution times from hours to minutes by instantly analyzing your environment, correlating events across systems, and identifying root causes with transparent AI reasoning.

PagerDuty Joins Glean's AI Ecosystem: Unlocking More Seamless Incident Management

Today, we announced that PagerDuty is now officially part of the Glean MCP Directory! This partnership brings together two leaders in AI-powered productivity and operations, making it easier than ever for organizations to connect PagerDuty’s incident data directly to any AI tool or agent in their stack through the standardized Model Context Protocol (MCP). PagerDuty is the first (and currently only) incident management partner that is available via Glean’s AI ecosystem.

Agentic AI Becomes Essential: Why Adoption Is Accelerating and What Comes Next

The cautious optimism business leaders held towards AI agents has evolved into more widespread enthusiasm. In our last survey from April 2025, just over half (51%) of companies had deployed AI agents in their organization. Six months later, 75% of companies are deploying more than one agent, according to PagerDuty’s latest research.

Automate or Elevate? 5 Steps to Build an AI-Powered Incident Playbook

Modern development tools, CI/CD infrastructure, and AI have accelerated the pace at which companies release software. This speed supports innovation, but it also increases complexity and the chance of something breaking in ways that aren’t immediately obvious. Teams now deal with more operational data, complex failure patterns, and systems where a small configuration change can ripple across dozens of microservices.

You Don't Need a Five-Year AI Plan. You Need a Five-Week One.

In my travels, I constantly hear about plans that promise to “unlock the full power of AI” down the road. The usual advice is to start small with a few pilots, then gradually scale up from there. It looks good on paper, but in practice it becomes a months-long slog of one-off experiments that burn a lot of capital and usually generate little impact on their own.

How to Choose Incident Management Software

Choosing the right incident management software can make or break your organization’s operational resilience. Modern IT environments are growing more complex, and customer expectations for always-on services are rising with them. Having robust incident management capabilities isn’t just nice to have; it’s essential for business continuity.