Incident Response

Automate incident response workflows with Eventarc and Datadog

Eventarc is a Google Cloud offering that ingests and routes events between GCP products, such as Cloud Run, Cloud Functions, and Pub/Sub, making it easy to build automated, event-driven workflows in complex environments. By taking care of event ingestion, delivery, authorization, and error handling, Eventarc reduces the development overhead required to build and maintain these workflows and helps you improve application resilience.

Anti-patterns in Incident Response that you should unlearn

It is important to invest time and effort in understanding why a system performs the way it does and how we can improve it. Companies continue with practices that yield successful results, but ignoring anti-patterns can be far worse than choosing rigid processes. In this blog, we will explore anti-patterns in incident response and why you should unlearn them.

What's New: Updates to Incident Response, PagerDuty Process Automation, Integrations, and More!

Following another successful PagerDuty Summit, development continues across several areas of the product. We’re excited to announce a new set of updates and enhancements to the PagerDuty Operations Cloud. Recent updates from the product team include Incident Response, PagerDuty® Process Automation and PagerDuty® Runbook Automation, Partner Integrations & Ecosystem, as well as Community & Advocacy Events updates.

Our fully-redesigned incident response experience delivers a more intuitive workflow

Today we’re releasing fully redesigned Slack and Command Center experiences for FireHydrant so anyone on your team can intuitively navigate the incident response process, whether in the app or on the web. There are many things you can do ahead of an incident to help things run smoothly: design and document your process, automate predictable steps, train the team, and run drills.

Streamline incident response on a unified platform

Incident response is at the heart of great, or terrible, user experience. However, while tools have evolved, challenges remain, especially for L1 agents. The solution isn’t about getting more tools. It’s about establishing a unified platform across the IT service desk and the IT Operations teams that empowers L1 agents to collaborate across silos with L2 agents, subject matter experts, and DevOps personnel.

More Powerful than Ever: PagerDuty's Revamped Mobile App is Primed for Even Better Incident Response

2020 revolutionized how we work. Many went from full-time office work to 100% remote overnight. And now that in-office work is once again on the horizon, companies are thinking of ways to continue to work flexibly. However, this comes with increased challenges and a need for tools that match this working style. The PagerDuty mobile application is well recognized, with a 4.8-star rating on the App Store and Google Play.

Words matter: incident management versus incident response

I recently published a couple of blog posts about what happens when you invest in a thoughtful incident management strategy and three first steps to take to do so. What I’m getting at in these posts is that we need a shift toward proactivity in the software operators community. I’d wager most of the world is responding to incidents as they happen, and nothing more.

Developing a Data Breach Incident Response Plan

With cybersecurity boundaries going beyond the traditional walls of an office and attack surfaces constantly expanding, data breaches are inevitable. Managing risks from data breaches requires organizations to develop a comprehensive incident response plan: an established guideline that facilitates incident detection, response, and containment, and empowers cybersecurity analysts to secure a company’s digital assets.

How to Standardize Service Ownership at Scale for Improved Incident Response

Service ownership is a DevOps best practice where team members take responsibility for supporting the software they deliver at every stage of the development lifecycle. This level of ownership brings development teams much closer to their customers, the business, and the value being delivered. Service owners are the subject matter experts (SMEs) for their services – and in a service ownership model, they are also responsible for responding to any production issues.