The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

The Debrief: incident.io, say hello to AI

This week was a particularly exciting one for us at incident.io. We launched not one, not two, but four AI-powered features to help folks get the most out of their incidents. In this episode of The Debrief, we sit down with Ed Dean, Product Analyst, and Charlie Revett, Product Engineer, to talk through all of these features and discuss how they're already making a measurable impact. You can learn more about our AI features here.

Terraform Time | Distribute PagerDuty config utilising Terraform Remote State

We'll explore how to distribute PagerDuty configuration between multiple repositories using the Terraform Remote State feature. You will be able to access the code written during this Terraform Time episode in the following GitHub repository.

The alert fatigue dilemma: A call for change in how we manage on-call

Once the unsung heroes of the digital realm, engineers are now caught in a cycle of perpetual interruptions thanks to alerting systems that haven't kept pace with evolving needs. A constant stream of notifications has turned on-call duty into a source of frustration, stress, and poor work-life balance. In 2021, 83% of software engineers surveyed reported feelings of burnout from high workloads, inefficient processes, and unclear goals and targets.

Does Every Incident Need a Retrospective? Here's What the Experts Have to Say

Every quarter, we host a roundtable discussion centered around the challenges encountered by incident responders at the world’s leading organizations. These discussions are lightly facilitated and vendor-agnostic, with a carefully curated group of experts. Everyone brings their own unique perspective and experience to the group as we dive deep into the real-world challenges incident responders are facing today.

Never miss machines malfunctioning with ilert integration for Tulip

Downtime costs money. That's why an effective incident management system is crucial. We're excited to announce our new partnership with Tulip to help manufacturers manage incidents better. This integration is an important advancement for complex production processes that require an in-depth operational strategy.

Mastering incident resolution through Root Cause Changes

Discover a new way to handle incident resolution with our Root Cause Changes (RCC) feature. This tool optimizes incident management by linking incidents with relevant changes, resulting in a significant reduction in resolution time and an overall improvement in operational efficiency. Explore the world of incident resolution with our advanced RCC feature and unlock its benefits.

StatusCast: Conquer the Storm

Embark on a journey to conquer the storm with StatusCast! Watch our latest video to discover how our powerful incident communication and status page solutions empower you to navigate through challenges seamlessly. Unleash the potential to communicate effectively during disruptions and emerge stronger. Don't miss out—watch now and revolutionize your incident management game!

Lessons learned from building our first AI product

Since the advent of ChatGPT, companies have been racing to build AI features into their products. Previously, if you wanted AI features you needed to hire a team of specialists to build machine learning models in-house. But now that OpenAI’s models are an API call away, the investment required to build shiny AI has never been lower. We were one of those companies. Here’s our journey to building our first AI feature, and some practical advice if you’ll be doing the same.

Incident Response Plans: The Complete Guide To Creating & Maintaining IRPs

Speedily minimizing the negative impact of an information security incident is a fundamental element of information security management. The risks — loss of credibility in the eyes of users and other stakeholders, loss of business revenue and critical data, potential regulatory penalties — can significantly jeopardize your organization’s mission and objectives.