|
By Stephen Whitworth
I want to walk you through how incident management has evolved, drawing from real data and the experiences of some of the most sophisticated tech organizations out there. I'll also introduce you to a framework we’ve developed at incident.io: the Incident Maturity Model. This framework is the result of thousands of conversations with companies and provides a clear roadmap to help your organization improve its incident management practices—no matter where you're starting from.
|
By Chris Evans
On August 28th, 2023—right in the middle of a UK public holiday—an issue with the UK’s air traffic control systems caused chaos across the country. The culprit? An entirely valid flight plan that hit an edge case in the processing software, partly because it contained a pair of duplicate airport codes.
|
By Lawrence Jones
Picture this: your alerting system needs to tell you it's broken. Sounds like a paradox, right? Yet that’s exactly the situation we face as an incident management company. We believe strongly in using our own products - after all, if we don’t trust ourselves to be there when it matters most, why should the thousands of engineers who rely on us every day? However, this poses an obvious challenge.
|
By Martha Lambert
At incident.io, we run on a monolith. This brings a whole load of benefits that we don’t want to give up any time soon. We don’t have to worry about the speed of internal network requests, complex deployments, or optimizing work that touches multiple services. This blog post isn’t about the relative benefits of monoliths though (but we’ve written more about that here if you are interested)! Ownership in monoliths is tricky.
|
By Lambert Le Manh
As a provider of incident management software, we at incident.io manage sensitive data regarding our customers. This includes Personally Identifiable Information (PII) about their employees, such as emails, first names, and last names, as well as confidential details regarding customer incidents, such as names and summaries. Consequently, we approach the management of this data with a great deal of care.
|
By Jack Colsey
We've written several times about our data stack here at incident.io, but never about our underlying data warehouse and the design principles behind it. This blog post will run through the high-level structure of our data warehouse and then go in depth into the underlying layers.
|
By Pete Hamilton
Writing a meaningful update for customers every week has been held sacred at incident.io since we started the company. We've written over 200 of them in the past 4 years, and we recently celebrated going 2 years straight without missing a single week. The numbers themselves are not the goal, but the consistency of this habit and what it represents for our customers and our team is very real, and special to me.
|
By Sam Starling
With every job I have, I come across a new observability tool that I can’t live without. It’s also something that’s a superpower for us at incident.io: we often detect bugs faster than our customers can report them to us. A couple of jobs ago, that was Prometheus. In my previous job, it was the fact that we retained all of our logs for 30 days, and had them available to search using the Elastic stack (back then, the ELK stack: Elasticsearch, Logstash, and Kibana).
|
By Milly Leadley
Indexes can make a world of difference to performance in Postgres, but it’s not always obvious when you’ve written a query that could do with an index. Here we’ll cover.
|
By Charlie Kingston
The Digital Finance Strategy is a European directive that aims to support and develop digital finance in Europe while maintaining financial stability and consumer protection. There are three main components to the package. In this blog post, we’ll attempt to summarize the 113-page DORA proposal, highlighting how it will apply to incident management at financial entities. Side note: we also wrote a blog post about the other DORA, also known as DevOps Research and Assessment.
|
By Incident.io
In this episode, we take a look back at 2024 at @incident-io, reflecting on the year’s personal milestones, company-wide changes, and how our product has evolved along the way. Of course, no reflection would be complete without a healthy dose of "banter". Join us as we wrap up the year with insights, laughs, and a look ahead to what's coming in early 2025.
|
By Incident.io
This week, we show how you can manage large-scale incidents by breaking the work down into streams with their own Slack channels and calls.
|
By Incident.io
This week we walk through writing post-mortems in the app, from resolving the incident to building a comprehensive post-incident summary directly in-app.
|
By Incident.io
Watch Derek's full talk from SEV0 here: https://go.incident.io/a8xPaeB
|
By Incident.io
Like it or not, AI is having a monumental impact on our lives. Most of the products we engage with today have AI features and functionality, aimed at assisting or completely replacing the actions normally taken by humans. When it comes to incidents, we’re firm believers in accelerating human actions, and believe the risks of over-automation far outweigh the benefits. In this live event we’ll dig a little deeper into why, as we cover the power and pitfalls of AI.
Create, manage and resolve incidents directly in Slack. Leave the admin and reporting to us.
Improving your incident response, visibility, and ability to learn:
- Less faffing, more fixing: We take care of the admin during incidents, so you can save your brainpower for the decisions that matter.
- Divide and conquer: We make sure everyone’s role is clear, track who’s working on what, and help you escalate if you need extra help.
- Get up to speed, at speed: Get everyone on the same page from the moment they join the incident, and help stakeholders stay in the loop.
- Timelines, in no time: Constructing an incident timeline for review is important, but time consuming. We’ll build one for you in real-time, and keep it constantly up to date.
- Data and insights you can trust: You’ve already paid for your incidents. By surfacing the data you need to make decisions, we help you get your money’s worth.
Incident response for your whole organisation.