
Incident Management

The latest news and information on incident management, on-call, incident response, and related technologies.

There Are No Repeat Incidents

People seem to struggle with the idea that there are no repeat incidents. It is easy and natural to see two distinct outages with nearly identical failure modes, impacting the same components, and with no significant action items, as repeat incidents. However, when we look at the responses and their variations, we can find key distinctions that show the incidents are related but not identical.

Fast Track Video Series: See a demonstration of BigPanda's Incident Intelligence and Automation Platform

BigPanda transforms millions of events into a small number of actionable alerts, no matter where they originate. How? Watch this video to learn more. The video shows how BigPanda allows you to normalize tag values across all tools, aiding event enrichment and correlation. The open integration manager then makes it easy to pre-process the event data, helping to filter unwanted events from the feed. The filtering strips out duplicate and low-relevancy events and keeps them from cluttering up the console.

Bitrix24 + Squadcast Integration: Simplifying Alert Routing

Bitrix24 is a cloud-based business management and collaboration platform that provides a suite of tools for managing various business processes. If you use Bitrix24 for your collaboration and CRM requirements, you can integrate it with Squadcast, an end-to-end Incident Response tool, to route alerts to Squadcast for events such as creating a lead in Bitrix24 CRM or creating a task in Bitrix24.

MSP Guide to Navigating an Impending Recession

Amidst mounting pressure from macroeconomic headwinds, businesses must prepare for declining consumer spending, less investment, and tighter credit conditions to survive. Managed Service Providers (MSPs) play a valuable role in helping businesses to navigate upcoming economic downturns, from optimizing costs to providing scalable solutions.

How to Create a Runbook Template for DevOps (With Examples)

A DevOps runbook is a little like a recipe book. Instead of rules for cooking, it’s a compilation of rules and procedures designed to maintain software systems and other applications. The purpose of each runbook is to cross-educate your entire team with the same knowledge base and provide easy-to-follow instructions in time-sensitive situations like incidents. Runbook templates are guides outlining a standard for the documentation of operations and development.

Endtest + Squadcast Integration: Alert Routing Made Easy

Endtest is a low-code test automation platform enabling organizations to efficiently build automated end-to-end tests for web and mobile applications. If you use Endtest for your test automation requirements, you can integrate it with Squadcast, an end-to-end Incident Response tool, to route detailed alerts from Endtest to the right users in Squadcast.

FireHydrant Private Incidents & Runbooks: more control for you, more security for your customers

Ensuring the privacy and security of sensitive information is crucial no matter your company's size or industry. So when an incident involves sensitive information, such as personally identifiable information (PII), financial data, accidental data breaches, or legal matters requiring privileged communication, your response process might need a higher level of security and discretion.

Addressing the dynamic incident communication challenges of the enterprise with CommsFlow

At enterprise scale, effective flow of incident awareness requires sharing many distinct pieces of information with many unique stakeholders serving different roles in the organization at precise moments in time. The creation of these dynamic communications and their delivery is constantly put to the test by the pressure of knowing that for every minute the incident is allowed to persist, potentially hundreds or thousands of customer businesses are being harmed.

PagerDuty Operations Cloud Product Demo

Check out the PagerDuty Operations Cloud in action. It detects and analyzes event data from across your digital operations, automates infrastructure and workflows, and mobilizes the right team members to minimize the impact of disruptive events on customers, employees, and brand reputation. It helps your teams free up time and reduce operations costs so you can deliver seamless experiences for your customers.