Incident.io

The Debrief: Incident management for data teams

If you're on a data team, have you ever considered using an incident management tool to respond to pipeline issues? If the answer is no, then you might want to check out this episode. Here, we chat with Jack, Data Analyst at incident.io, to better understand why data teams can, and should, look to incident management tools like incident.io to manage issues. Read Jack's blog post about incident management for data teams.

Reducing the burden of incident response on your teams

In this webinar, a panel of engineering leaders, including Chris Evans, CPO at incident.io, share how they reduce the burden of incident response for their teams. They advocate for a culture of shared responsibility across the board, offering practical strategies to educate the business about engineering practices during the chaos of an outage.

Engineering nits: Building a Storybook for Slack Block Kit

We care a lot about the pace of shipping at incident.io: moving fast is a fundamental part of our company culture, and out-pacing your competition is one of the best ways we know to win. In engineering teams, one way to ship fast is to invest in tools that make your team more productive. We've become good at identifying small pains and frustrations that slow us down over time and, after surfacing them to the rest of the team, finding solutions for them.

Your incident declaration form is (probably) too long: The power of concise reporting

It’s 10am, your coffee is ready and piping hot, and you have just been paged. Looks like is down, and customers are starting to notice. With no time to lose, you open up your organization’s incident declaration form and you spend the next thirty minutes filling out the fifteen required fields, while the incident grows bigger and more complex, messages are rolling in, and your coffee grows cold.

Resilience Engineering in 2024: Challenges, Trends, & Priorities

Is your organization ready to fortify, expand, and cultivate a robust resilience engineering culture in 2024? In this webinar, Chris Evans (Co-founder & Chief Product Officer, incident.io) and Courtney Nash (Internet Incident Librarian, The VOID) delve into crucial considerations and top priorities for improving your organization’s ability to build safer and more reliable complex systems, while unlocking insights for shaping your plans for 2024 and beyond.

Should data teams consider incident management tools to respond to pipeline issues?

Data teams are adopting more processes and tools that align with software engineering, and judging from talks at the dbt Coalesce conference in 2023, there’s clearly a big push towards adopting software engineering practices at enterprise-scale companies. At the moment, there are a lot of tools in the data space for identifying errors in data pipelines, but no tools for responding to these errors, such as coordinating fixes. This is exactly where implementing an incident management platform makes sense.

Incident management really can be for everyone

Incident management tools are often built for engineers to solve technical issues. On the surface, thinking of incident management as an engineering problem makes sense, and it’s an approach that’s widely used by many organizations from small startups to large enterprises. When there's a problem like a checkout page failure or a server crash, it’s natural for engineers to spring into action, declaring and resolving these incidents.

The price of building your own incident management tool is not what it seems

Build or buy? An age-old decision that gets made dozens of times a year. It’s quite possibly one of the most important decisions you make as a company. It impacts roadmaps, productivity, team structure, and customer satisfaction (you know, just a few little things). There are a lot of factors to consider, one of the most prominent being cost. So, what exactly are the costs you need to consider when building your own incident management solution?

Learning Flows: Bringing consistency to your post-incident processes

To get the most out of your incident response processes, consistency is crucial. The more predictable you can be whenever issues crop up, whether a small bug or a major outage, the quicker and more confidently you can respond. In practice, incident response is equal parts knowing how to actually resolve the issue and having the confidence that the processes in place will help get you through without added stress.