Latest Posts

Keeping it boring: the incident.io technology stack

Feb 26, 2026 By Article In Incident.io

At incident.io we run a deliberately simple technology stack. Keeping things boring has allowed us to scale from a few hundred customers to several thousand with only two platform engineers. In this post I'll walk through the stack, explain some of the choices we've made, and touch on the challenges we're facing as we grow.

Secure access at the speed of incident response

Feb 24, 2026 By Article In Incident.io

Picture this: it's 2am, your pager goes off, and you're staring at a production database that's on fire. You know exactly what's wrong. You know exactly how to fix it. But you can't touch anything because you're waiting on someone to approve your access request. Meanwhile, your customers are down, your SLAs are bleeding out, and you're refreshing Slack hoping someone in security is awake to click "approve." This is the incident response tax that too many teams pay.

Everything you need to know about ITIL 5, AI and incident management

Feb 1, 2026 By Article In Incident.io

ITIL 5 launched in January 2026, and for the first time in the framework's 40-year history, AI governance is front and center. If you're running incident management, on-call rotations, or building operational tooling, this matters: the gap between AI adoption and AI governance is about to become a compliance and operational risk issue. I’m not usually a big ITIL fan, but this guidance has some genuinely useful framing and questions.

DevEx matters for coding agents, too

Dec 19, 2025 By Article In Incident.io

The speed at which you can go from making a change in your code to understanding whether it actually works has long been a popular topic of discussion (and often, humour) for engineers. This remains true in a world with AI. Developer experience isn't just important for humans anymore. Those agents we're all using hundreds of times a day? Feedback cycles matter just as much for them, if not more.

Stop choosing between fast incident response and secure access

Dec 1, 2025 By Article In Incident.io

Every production system will eventually break. It's not pessimism, it's just reality. That's why engineers go on-call, and why companies invest heavily in incident response tooling. But here's the problem: the moment an engineer goes on call, they typically need elevated access to production systems, databases, and sensitive customer data. And that elevated access? It's often permanent, overly broad, and a security nightmare waiting to happen.

Bloom filters: the niche trick behind a 16× faster API

Nov 14, 2025 By Engineering In Incident.io

This post is a deep dive into how we improved the P95 latency of an API endpoint from 5s to 0.3s using a niche little computer science trick called a bloom filter. We’ll cover why the endpoint was slow, the options we considered to make it fast and how we decided between them, and how it all works under the hood.
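
The excerpt does not show the implementation, so what follows is only a generic sketch of a bloom filter, not the code from the post. A bloom filter answers "might this item be in the set?" with a fixed amount of memory: it can return false positives but never false negatives. The class name, parameters, and hashing scheme here are all assumptions chosen for illustration.

```python
import hashlib
import math


class BloomFilter:
    """Illustrative Bloom filter: false positives possible, false negatives not."""

    def __init__(self, expected_items: int, false_positive_rate: float = 0.01):
        # Textbook sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        self.num_bits = max(
            1,
            math.ceil(-expected_items * math.log(false_positive_rate) / math.log(2) ** 2),
        )
        self.num_hashes = max(1, round(self.num_bits / expected_items * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive k bit positions from two halves of one SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so the step is never zero
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        # Set every bit position derived from the item.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


# Usage with made-up identifiers.
seen = BloomFilter(expected_items=1000)
seen.add("incident-42")
print(seen.might_contain("incident-42"))
```

A filter like this is typically used as a cheap pre-check: a "definitely absent" answer lets the caller skip an expensive lookup, while a "possibly present" answer still needs to be confirmed against the real data.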

Service disruption on October 20, 2025

Oct 22, 2025 By Article In Incident.io

When the internet goes down, our primary job is to help everyone get back up, as fast as possible. Of the almost half a million incidents we've helped our customers solve, there are some which stand out for both their scale and impact. One of these happened on Monday, October 20, when AWS had a widely covered major outage in their us-east-1 region, from 07:11 to 10:53 UTC. We’re hosted in multiple regions of Google Cloud and so the majority of our product was unaffected by the outage.

Recapping SEV0 San Francisco 2025

Sep 30, 2025 By Article In Incident.io

Earlier this week, we gathered in San Francisco for our second SEV0—almost a year after our very first event. SEV0 has always been about shining a light on the biggest challenges (and opportunities) in incident response. Last year, we were still talking about the fundamentals: blameless culture, strong processes, and lessons from the best in reliability. This year felt different. AI has moved from background noise to front and center in every conversation, every team, everywhere.

Impact review: Scribe under the microscope

Aug 20, 2025 By Engineering In Incident.io

In December 2024 we launched Scribe to help responders never miss a detail from their incident calls. By automatically transcribing calls and highlighting key information, Scribe eliminates manual note-taking, reduces time spent getting up to speed, and preserves valuable context for post-incident analysis. The feature quickly gained popularity among our customers, but with success came an influx of requests for bug fixes, extra functionality, and wider call platform support.

Using Claude to power up your onboarding

Aug 14, 2025 By Article In Incident.io

I joined incident.io about ten weeks ago, having been in my previous role for four and a half years. Being a new starter was an unusual feeling for me, and there's been a huge amount to learn, but by lunch on my second day (!) I had started shipping value to our customers. A large part of hitting the ground running has been having a colleague alongside me whom I can pester with questions, who doesn’t get offended when I write in all capitals, and who often praises me for being absolutely right!
