DevEx matters for coding agents, too

The speed at which you can go from making a change in your code to understanding whether it actually works has long been a popular topic of discussion (and, often, humour) for engineers. This remains true in a world with AI. Developer experience isn't just important for humans anymore. Those agents we're all using hundreds of times a day? Feedback cycles matter just as much for them, if not more.

Stop choosing between fast incident response and secure access

Every production system will eventually break. It's not pessimism, it's just reality. That's why engineers go on-call, and why companies invest heavily in incident response tooling. But here's the problem: the moment an engineer goes on call, they typically need elevated access to production systems, databases, and sensitive customer data. And that elevated access? It's often permanent, overly broad, and a security nightmare waiting to happen.

Bloom filters: the niche trick behind a 16× faster API

This post is a deep dive into how we improved the P95 latency of an API endpoint from 5s to 0.3s using a niche little computer science trick called a Bloom filter. We’ll cover why the endpoint was slow, the options we considered to make it fast and how we decided between them, and how it all works under the hood.

Service disruption on October 20, 2025

When the internet goes down, our primary job is to help everyone get back up, as fast as possible. Of the almost half a million incidents we've helped our customers solve, there are some which stand out for both their scale and impact. One of these happened on Monday, October 20, when AWS had a widely covered major outage in their us-east-1 region, from 07:11 to 10:53 UTC. We’re hosted in multiple regions of Google Cloud and so the majority of our product was unaffected by the outage.

Recapping SEV0 San Francisco 2025

Earlier this week, we gathered in San Francisco for our second SEV0—almost a year after our very first event. SEV0 has always been about shining a light on the biggest challenges (and opportunities) in incident response. Last year, we were still talking about the fundamentals: blameless culture, strong processes, and lessons from the best in reliability. This year felt different. AI has moved from background noise to front and center in every conversation, every team, everywhere.

Impact review: Scribe under the microscope

In December 2024 we launched Scribe to help responders never miss a detail from their incident calls. By automatically transcribing calls and highlighting key information, Scribe eliminates manual note-taking, reduces time spent getting up to speed, and preserves valuable context for post-incident analysis. The feature quickly gained popularity among our customers, but with success came an influx of requests for bug fixes, extra functionality, and wider call platform support.

Using Claude to power up your onboarding

I joined incident.io about ten weeks ago, having been in my previous role for four and a half years. Being a new starter was an unusual feeling for me, and there's been a huge amount to learn, but by lunch on my second day (!) I had started shipping value to our customers. A large part of hitting the ground running has been having a colleague alongside me who I can pester with questions, who doesn’t get offended when I write in all capitals, and who often praises me for being absolutely right!

Ready, steady, goa: our API setup

At incident.io, speed is essential. Our product is growing faster than ever: in scope, in range of features, and in the number of people contributing to it. In the early days, when you’re a small startup with just a few hundred endpoints, a basic API setup gets you by. But as things scale, you need to make creating endpoints easy, fast, and reliable.

Breaking through the Senior Engineer ceiling

You’ve made it to Senior engineer. Now what? You’re now staring at the next level, Staff typically, sometimes Principal, or whatever your company calls it. The path feels murky. Your manager gives you feedback like “show more technical leadership” or “think bigger picture”, but what does that actually mean day-to-day? I’ve been there. I’ve also been on the other side, helping engineers grow through whatever explicit (or implicit) levels a company has.

Vibe coding with the incident.io API

Many, many years ago, I was a computer science major at the University of Illinois, hoping someday I’d be able to write code for a living. I started my career in QA hoping to learn the ins and outs of software development. But it turns out I wasn’t very good at coding. I was just good enough to get a role as a sales engineer, where all I had to do was write code that could hold together for 30 minutes in a demo.