Incident.io

Why I like discussing actions items in incident reviews

Oct 7, 2024 By Chris Evans In Incident.io

Are incident reviews about learning or tracking actions? This question has sparked recent debate in incident management circles, including in my recent panel at SEV0 and in Lorin Hochstein’s post. Should the goal of an incident review be learning, or should it focus on tracking actionable improvements? When is the right time to discuss actions, and are they picked up just to make us feel better? From my experience, learning from incidents and identifying actions are inseparable.

incident.io is best in class for momentum, relationships and enterprise adoption

Oct 1, 2024 By incident.io In Incident.io

Trust doesn’t just happen overnight. For us at incident.io, it’s been a journey—one that’s focused on people just as much as the product. From the start, we knew that building great incident management software wasn’t just about creating features and functionality. It was about building relationships, understanding our users, and truly being there for them when it matters most. Our focus has always been to help teams manage incidents better.

What does SLO stand for? A complete guide to Service Level Objectives (SLOs)

Sep 12, 2024 By Kate Bernacchi-Sass In Incident.io

The world of tech is full of acronyms. SLOs are one of those that everyone talks about, but maybe not everyone fully gets. Whether you're nodding along in meetings or just hearing “SLO” for the first time, we’ve got you covered. In this post, we’ll break down what Service Level Objectives (SLOs) actually are, why they matter, and how they can help keep your systems (and your sanity) in check.

The ultimate guide to on-call schedules

Sep 12, 2024 By Chris Evans In Incident.io

An Ultimate Guide to on-call schedules? You might think this sounds overly grandiose for what’s essentially putting people into a list and rotating through them. But you’d be flat-out wrong. Getting your on-call setup correct is as real and as important as it gets, and getting things wrong can lead to prolonged incidents, burnt-out employees, and a damaged company reputation.

Data quality testing

Sep 4, 2024 By Lambert Le Manh In Incident.io

Data quality testing is a subset of data observability. It is the process of evaluating data to ensure it meets the necessary standards of accuracy, consistency, completeness, and reliability before it is used in business operations or analytics. This involves validating data against predefined rules and criteria, such as checking for duplicates, verifying data formats, ensuring data integrity across systems, and confirming that all required fields are populated.
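The rule-based checks named in this excerpt (duplicates, formats, required fields) can be sketched in a few lines. This is a minimal illustration, not the post's own method: the sample records, the field names, the email pattern, and every function name below are assumptions made up for the example.

```python
import re
from collections import Counter

# Hypothetical sample records; the field names are illustrative only.
RECORDS = [
    {"id": 1, "email": "a@example.com", "country": "GB"},
    {"id": 2, "email": "not-an-email", "country": "US"},
    {"id": 2, "email": "b@example.com", "country": ""},
]

# Assumed rules: which fields must be populated, and a deliberately
# loose email shape check (not a full RFC-compliant validator).
REQUIRED_FIELDS = ("id", "email", "country")
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def duplicate_ids(records):
    """Return the id values that appear more than once."""
    counts = Counter(r["id"] for r in records)
    return sorted(key for key, n in counts.items() if n > 1)


def invalid_emails(records):
    """Return the ids of records whose email fails the format rule."""
    return [
        r["id"]
        for r in records
        if not EMAIL_PATTERN.match(str(r.get("email", "")))
    ]


def missing_required(records, required=REQUIRED_FIELDS):
    """Return (record index, field) pairs where a required field is absent or empty."""
    return [
        (index, field)
        for index, r in enumerate(records)
        for field in required
        if r.get(field) in (None, "")
    ]


def run_checks(records):
    """Run every rule and collect the findings under one key per rule."""
    return {
        "duplicate_ids": duplicate_ids(records),
        "invalid_emails": invalid_emails(records),
        "missing_required": missing_required(records),
    }


if __name__ == "__main__":
    for rule, findings in run_checks(RECORDS).items():
        print(rule, findings)
```

Keeping each rule as its own small function makes it easy to add or remove checks and to report failures per rule rather than as a single pass/fail result.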

A new era for Catalog

Aug 28, 2024 By Charlie Kingston In Incident.io

Last year, we released Catalog—the connected map of everything in your organization. Catalog was built with the aim of tackling one of the most painful parts of incident response: contextualizing problems and understanding their place within your organization.

Building On-call: Our observability strategy

Aug 22, 2024 By Martha Lambert In Incident.io

At incident.io, we run an on-call product. Our customers need to be sure that when their systems go wrong, we’ll tell them about it—high availability is a core requirement for us. To achieve the level of reliability that’s essential to our customers, excellent observability (o11y) is one of the most important tools in our belt. When done right, observability improves your product experience from two angles.

Introducing: incident.io for Microsoft Teams

Aug 13, 2024 By Ed Dean In Incident.io

There’s a major outage. Support tickets are mounting. Everybody from engineering to legal is scrambling for information. You have more Teams notifications clamouring for attention than you do minutes to address them, and it’s hard to know where to begin. What comes next is a balancing act—mitigating the impact, updating colleagues, managing action items, or updating a status page that will be seen by millions.

Building On-call: Continually testing with smoke tests

Aug 9, 2024 By Rory Malcolm In Incident.io

With the release of On-call, our system’s reliability had to be solid from the outset. Our customers have high expectations of a paging product—and internally, we would not be comfortable with releasing something that we weren’t sure would perform under pressure. While our earlier product, Response, was the core of a customer’s incident response process after an incident was detected, we’re now the first notification an engineer gets when something’s wrong.

Redefining incident management: the power and pitfalls of AI

Jul 31, 2024 By Incident.io In Incident.io

Like it or not, AI is having a monumental impact on our lives. Most of the products we engage with today have AI features and functionality, aimed at assisting or completely replacing the actions normally taken by humans. When it comes to incidents, we’re firm believers in accelerating human actions, and believe the risk of over-automation far outweighs the benefits. In this live event we’ll dig a little deeper into why, as we cover the power and pitfalls of AI.
