Incident Management

The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

The Debrief: Build vs buy

Oct 11, 2023 By Incident.io In Incident.io

Almost every organization will eventually face an important crossroads: should it build the tooling it needs, or buy it? More often than not, buying is the most sensible decision, one that saves the most time, effort, and even money. Still, there are some edge cases where building can be the right choice. In this chat with Isaac, product engineer at incident.io, we dive into this nuanced debate and explain why buying is your best bet... most of the time.

View Video

After Hours Alerting for ConnectWise

Oct 11, 2023 By SIGNL4 In SIGNL4

A short demo video on how to add After Hours Alerting with SIGNL4 to your ConnectWise PSA. We show you the complete workflow and what to keep in mind for seamless connectivity and targeted mobile alerting, including duty scheduling for your teams.

View Video

SLA vs. SLO vs. SLI: What's the Difference?

Oct 11, 2023 By Laura Clayton In Uptime Robot

When it comes to managing services effectively, terms like SLA, SLO, and SLI are often thrown around like confetti at a parade. They’re in meetings, in documents, and even in casual office conversations. But if you’re new to the field or simply haven’t had the chance to dig into these acronyms, they can feel like a bewildering alphabet soup. And they can’t be missing from an uptime monitoring blog such as ours! So, what do these terms really mean?

Read Post

A guide to post-mortem meetings and how we run them at incident.io

Oct 11, 2023 By Luis Gonzalez In Incident.io

You've just made it through a particularly tough incident. It was a short outage affecting a subset of customers, so not exactly the end of the world, but bad enough that it involved multiple people across a number of teams to resolve. Either way, the incident was well managed, and the dust has settled. Now what? Most guidance would say that putting together a post-mortem document is a good idea, given the severity of the incident. You've also done this, so what's next?

Read Post

Introduction to ilert AI

Oct 10, 2023 By iLert In iLert

During the intensity of incident response, it is crucial to maintain concentration on resolving the problem promptly. At times, crafting a thorough and precise incident communication can be difficult, particularly when under pressure. This is where ilert's AI-powered incident communication feature becomes valuable.

View Video

Three Ways to Better Appreciate your SREs and DevOps Engineers

Oct 10, 2023 By Emily Arnott In Blameless

DevOps engineers and Site Reliability Engineers are vitally important to the continued health of your product and business. We all know it’s true, and yet people in these roles often feel underappreciated and undervalued. This sort of work runs into the issue of “when process and infrastructure break, it gets shoved in the spotlight; but when everything works perfectly, no one notices.”

Read Post

The Unplanned Show, Episode 16: Resiliency with Sam Newman

Oct 10, 2023 By PagerDuty In PagerDuty

When the author of Building Microservices (O'Reilly) tweets asking for a "plurality of views" on resiliency, I, for one, am intrigued. In this episode, we'll hear from Sam Newman about his latest thinking on resiliency.

View Video

How AIOps modernizes CMDBs to drive accuracy and value

Oct 10, 2023 By Blair Sibille In BigPanda

Keeping your Configuration Management Database (CMDB) accurate, fully updated, and performant is a frustrating and elusive goal for ITOps and IT leaders. Aiming for this ‘golden’ CMDB standard can feel like running on a treadmill: you put in a lot of work, but remain as distant as ever from your goal. Can IT leaders ever catch up?

Read Post

Bridging the ITIL vs DevOps Mindset: CI/CD Best Practices for ITIL Organizations

Oct 9, 2023 By Elik Eizenberg In BigPanda

DevOps practices in software development have revolutionized the way updates are released. However, many companies entrenched in ITIL practices find it challenging to seamlessly integrate with the DevOps practice of Continuous Integration and Continuous Delivery/Deployment (CI/CD). This is because ITIL focuses on stability, which suits older systems, while DevOps is ideal for modern setups with its agile, automated practices.

Read Post

Revolutionizing your Grafana setup with intelligent alerting

Oct 9, 2023 By emily In SIGNL4

Once upon a time, in the bustling city of DataVille, lived a team of dedicated IT professionals tirelessly working to maintain the city’s digital heartbeat. Their mission was to ensure the smooth operation of their city’s digital infrastructure, which was not limited to the daytime operations but extended beyond business hours. They were the unsung heroes, the guardians of the city’s data. Their tool of choice? Grafana, a powerful open-source platform for observability.

Read Post