|
By Lawrence Jones
We care a lot about the pace of shipping at incident.io: moving fast is a fundamental part of our company culture, and out-pacing your competition is one of the best ways we know to win. In engineering teams, one way to ship fast is to invest in tools that make your team more productive. We've become good at identifying small pains and frustrations that slow us down over time and, after surfacing them to the rest of the team, finding solutions for them.
|
By Matilda Hultgren
It’s 10am, your coffee is ready and piping hot, and you have just been paged. Looks like is down, and customers are starting to notice. With no time to lose, you open up your organization’s incident declaration form and you spend the next thirty minutes filling out the fifteen required fields, while the incident grows bigger and more complex, messages are rolling in, and your coffee grows cold.
|
By Jack Colsey
Data teams are adopting more processes and tools that align with software engineering, and from talks at the dbt Coalesce conference in 2023, there’s clearly a big push towards adopting software engineering practices at enterprise-scale companies. At the moment, there are a lot of tools in the data space for identifying errors in data pipelines, but no tools for responding to these errors, such as coordinating fixes. This is exactly where implementing an incident management platform makes sense.
|
By incident.io
Incident management tools are often built for engineers to solve technical issues. On the surface, thinking of incident management as an engineering problem makes sense, and it’s an approach that’s widely used by many organizations from small startups to large enterprises. When there's a problem like a checkout page failure or a server crash, it’s natural for engineers to spring into action, declaring and resolving these incidents.
|
By Asiya Gorelik
Build or buy? An age-old decision that gets made dozens of times a year. It’s quite possibly one of the most important decisions you make as a company. It impacts roadmaps, productivity, team structure, and customer satisfaction (you know, just a few little things). There are a lot of factors to consider, one of the most prominent being cost. So, what exactly are the costs you need to consider when building your own incident management solution?
|
By Luis Gonzalez
To get the most out of your incident response processes, consistency is crucial. The more predictable you can be whenever issues crop up, whether a small bug or a major outage, the quicker and more confidently you can respond. In practice, incident response is equal parts knowing how to actually resolve the issue and having the confidence that the processes in place will help get you through without added stress.
|
By Luis Gonzalez
You've just made it through a particularly tough incident. It was a short outage affecting a subset of customers, so not exactly the end of the world, but bad enough that it involved multiple people across a number of teams to resolve. Either way, the incident was well managed, and the dust has settled. Now what? Most guidance would say that putting together a post-mortem document is a good idea, given the severity of the incident. You've also done this, so what's next?
|
By incident.io
No one wants to be on the receiving end of the blame game—especially in the wake of a major incident. Sure, you know you were the one who made the final change that caused the incident. And hopefully, it was a small one that didn’t cause any SEV-1s. Still, the weight of knowing you caused something bad should be enough, right? Unfortunately, sometimes fingers get pointed, your name gets called, and suddenly, everyone knows that you’re the person who created more work for everyone.
|
By Luis Gonzalez
If you’re just starting out in the world of incident response, then you’ve probably come across the phrase “post-mortem” at least once or twice. And if you’re a seasoned incident responder, the phrase probably evokes mixed feelings. Just to clarify, here, we’re talking about post-mortem documents, not meetings. It’s a distinction we have to make since lots of teams use the phrase to refer to the meeting they have after an incident.
|
By Ben Wheatley
We recently moved our infrastructure fully into Google Cloud. Most things went very smoothly, but there was one issue we came across last week that just wouldn’t stop cropping up. What follows is a tale of rabbit holes, red herrings, table flips and (eventually) a very satisfying smoking gun. Grab a cuppa, and strap in. Our journey starts, fittingly, with an incident getting declared... 💥🚨
|
By incident.io
In this webinar, a panel of engineering leaders, including Chris Evans, CPO at incident.io, share how they reduce the burden of incident response for their teams. They advocate for a culture of shared responsibility across the board, offering practical strategies to educate the business about engineering practices during the chaos of an outage.
|
By incident.io
Almost every organization will eventually face an important crossroads: should I build the tooling I need, or buy it? More often than not, the decision to buy is the most sensible one, saving you the most time, effort, and even money. Still, there are some edge cases where building can be the right choice. In this chat with Isaac, product engineer at incident.io, we dive into this nuanced debate and explain why buying is your best bet...most of the time.
|
By incident.io
In this video, we talk through some of the nuances of incident management and problem management, why it's better to think of them as one, and how having more responsibility on teams to build and run their software and systems makes sense.
|
By incident.io
Effective incident management involves not just responding to incidents but also detecting them early and preparing for future occurrences to minimize impact.
|
By incident.io
In this snippet, Alon Levi, VP of Engineering at WorkOS, talks about why incident.io will be a key contributor to the growth of WorkOS.
|
By incident.io
The video explores the difference between incident management and problem management in modern organizations. It describes a common scenario where operations teams focus on immediate fixes, like rebooting systems, without addressing the root causes. Once the immediate issue is resolved, these teams pass the incident report to the developers, who are then responsible for digging deeper to prevent future occurrences.
|
By incident.io
In this snippet, Alon Levi of WorkOS highlights some of his favorite and most-used incident.io features: follow-ups and Workflows.
|
By incident.io
In this snippet, Alon Levi, VP of Engineering at WorkOS, talks about how his team has gained the confidence to declare more incidents with incident.io.
|
By incident.io
In this snippet, Alon Levi, VP of Engineering at WorkOS, talks about how non-technical responders have been able to confidently declare incidents thanks to incident.io's intuitive UI.
|
By incident.io
In this snippet, Alon Levi, VP of Engineering at WorkOS, talks about the quality of support his team has received from incident.io.
Create, manage and resolve incidents directly in Slack. Leave the admin and reporting to us.
Improving your incident response, visibility, and ability to learn:
- Less faffing, more fixing: We take care of the admin during incidents, so you can save your brainpower for the decisions that matter.
- Divide and conquer: We make sure everyone’s role is clear, track who’s working on what, and help you escalate if you need extra help.
- Get up to speed, at speed: Get everyone on the same page from the moment they join the incident, and help stakeholders stay in the loop.
- Timelines, in no time: Constructing an incident timeline for review is important, but time-consuming. We’ll build one for you in real time, and keep it constantly up to date.
- Data and insights you can trust: You’ve already paid for your incidents. By surfacing the data you need to make decisions, we help you get your money’s worth.
Incident response for your whole organisation.