Operations | Monitoring | ITSM | DevOps | Cloud

Incident Management

The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

Callable Flows - xMatters Support

In xMatters Flow Designer, you can use callable flows to initiate a major incident process in any workflow. Instead of including the same sequence of steps in each workflow, such as posting to a status page or opening a help desk ticket, you can build the sequence once as a separate workflow and then include that as a step in any of your workflows.

How to build organizational resilience: six proven steps

In today’s world, where natural disasters, terrorist threats, and cyberattacks are becoming increasingly common, business leaders must prioritize building resilience to ensure the long-term success of their organizations. However, the ways an organization can adapt, recover, and thrive in the face of adversity are often unclear.

Reduce MTTR and Address the Talent Gap with Logz.io Alert Recommendations

When our CEO and co-founder Tomer Levy delivered his “Observability is Broken” presentation at last year’s AWS re:Invent, he highlighted numerous challenges faced by today’s organizations as they seek to advance their observability practices. Of the six individual points that he noted, two specifically dealt with the current shortage of available engineering expertise, with another two focused on data overload.

Use incident cycle time to optimize your incident response process

Although the causes and solutions for incidents vary widely, most incidents follow a similar timeline from declaration to resolution. We call the period of time it takes to move from one phase or milestone of an incident to the next cycle time.

SIGNL4 Onboarding: 3rd Party Integration: Webhook & Email

The SIGNL4 Onboarding series walks users through the process's of SIGNL4 from Signup to Alerts to Settings. Todays video focuses on Scheduling users for duty shifts. Learn how to create an app inside of Signl4 to receive events from third party systems. Learn how to create an app and then receive events from those apps to create alerts. This video is packed with helpful tips to help you get the most out of your account.

Getting started with Squadcast's On-Call Scheduling

We understand that everyone values a simple and straightforward approach when it comes to setting up schedules. We at Squadcast are fully aware of the difficulties involved in creating an on-call schedule from scratch or migrating it to a new platform. Hence we have come up with a blog to assist you in seamlessly setting up your on-call schedule using Squadcast. Our goal is to provide guidance and support to make the process as effortless as possible for you.

Prometheus Blackbox Exporter: Guide & Tutorial

Prometheus is a favored open-source monitoring system that collects, stores, and queries metrics from various sources. In Prometheus, an exporter is a component that collects and exposes metrics in a format Prometheus can scrape. The Prometheus Blackbox Exporter is designed to monitor “black box” systems with internal workings that are not accessible by Prometheus. It sends HTTP, TCP, and ICMP requests to the external systems and measures their response times and statuses.

Data Shows Outage Time & Costs are Increasing - 3 Solutions You Should Consider

The Uptime Institute recently released its Annual Outage Analysis 2023 report. Overall, the report highlights the increasing costs, frequency, and duration of outages, the prominent role of cloud and digital services in outages, the shortcomings of service providers, and the need to address human error and management failures. It also underscores the ongoing challenges of handling failures in complex distributed architectures.