Incident Management

The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

Communicating to Users During Incidents

Imagine you're having a regular day at work, opening up your browser, double-checking something for a client in that web app your team built for them, when suddenly you see an error screen instead. You hit refresh a few times, just to be sure. Nope. Still down. What happens next depends on how well your team has planned for incidents like this (some folks call it unplanned downtime).

Improving your team's on-call experience

Your engineers probably dislike going on-call for your services. Some might even dread it. It doesn't have to be this way. With a few changes to how your team runs on-call and deals with recurring alerts, you might find your team starting to enjoy it (as unimaginable as that sounds). I wrote this article as a follow-up to Getting over on-call anxiety.

Getting over on-call anxiety

You've joined a company, or worked there a little while, and you've just now realised that you'll have to do on-call. You feel like you don't know much about how everything fits together. How are you supposed to fix it at 2am when you get paged? So you're a little nervous. Understandable. Here are a few tips to help you become less nervous.

What is MTTR? Resolve incidents faster through ops, alerting and documentation

When downtime strikes any distributed software deployment or platform, it’s all hands on deck until the lights are green and service is restored. This process, from the recognition of a problem to a deployed solution, has most commonly been defined as MTTR — mean time to resolution. In just the last few years, DevOps and site reliability (SRE) professionals have developed sophisticated new models for how they work and audit their successes.
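As a rough illustration of the metric described above, MTTR can be computed as the average time between recognizing a problem and deploying its fix. The sketch below is a minimal example under stated assumptions: the function name `mean_time_to_resolution` is hypothetical, and the incident timestamps are invented sample data, not figures from the article.

```python
from datetime import datetime, timedelta


def mean_time_to_resolution(incidents):
    """Average the duration of each (detected_at, resolved_at) pair.

    `incidents` is a list of 2-tuples of datetime objects. The name and
    the input shape are assumptions made for this sketch.
    """
    if not incidents:
        raise ValueError("need at least one resolved incident")
    # Sum the per-incident durations, starting from a zero timedelta.
    total = sum(
        (resolved - detected for detected, resolved in incidents),
        timedelta(),
    )
    return total / len(incidents)


# Invented sample data: three incidents lasting 45, 90 and 15 minutes.
incidents = [
    (datetime(2024, 1, 5, 2, 0), datetime(2024, 1, 5, 2, 45)),
    (datetime(2024, 1, 12, 14, 10), datetime(2024, 1, 12, 15, 40)),
    (datetime(2024, 1, 20, 9, 30), datetime(2024, 1, 20, 9, 45)),
]

mttr = mean_time_to_resolution(incidents)
print(mttr)  # 0:50:00
```

Teams differ on where the clock starts (first alert, first acknowledgement, first customer report), so the choice of `detected_at` matters as much as the arithmetic.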

A single pane of glass for automatic incident response for Bridgeport Public School District

“I have been doing this for 20+ years and have been using literally every product out there. Derdack is unique at how issues are addressed and communicated out because of the seamless integration, maturity and flexibility of the platform. Working with Derdack has been a game changer for us and helped us to do more with less.” Jeff Postolowski, Director Information Technology Services, Bridgeport Public School District

Get Started with Playbooks Permissions

The goal of Mattermost Playbooks is to help teams consistently orchestrate any and all recurring workflows. A Playbook is a prescribed, repeatable process that a team has agreed on and formalized as a collaborative checklist saved on their Mattermost server. We at Mattermost use Playbooks for incident collaboration, customer onboarding, and product releases, along with many other complex processes.

What is Incident Response?

When a service is down, a system is failing, or a security issue is unfolding, organizations need a solid incident response process to get up and running again. Incident response isn't just for high-severity, lights-out incidents either; if you've rebooted your computer to fix a problem, you've been an incident responder yourself! Incidents happen, and any successful organization knows that instead of pretending that one day nothing will ever go wrong, it's far more useful to develop a comprehensive operational response plan. And to do so, you need to know what incident response is! Let's get into it.

Improve Incident Response by Getting Control of Your (Unintelligent) Swarm

Incidents happen. Things go wrong. Systems fail. Sometimes they fail in unexpected and dramatic ways that create Major Incidents. PagerDuty makes a very specific distinction between an incident and an Incident. Your organization may also make such a distinction. Determining if an incident is major or not can come down to a number of factors, or a specific combination of factors, like the number of services affected, the customer impact, and the duration of the incident.

Achieving Maximum Patient Satisfaction Through Effective Clinical Communications

Judit Sharon, CEO and founder of OnPage Corporation, sits down with Healthcare Innovation to discuss how advanced, effective clinical communication systems help teams achieve ultimate patient satisfaction. How has the landscape around time-sensitive communications between and among clinicians and others in patient care delivery evolved in the past few years?