
How we're shipping faster with Claude Code and Git Worktrees

Four months ago, Claude Code was announced and we were requesting invites to its "Research Preview." Now? We've gone from no Claude Code to simultaneously running four or five Claude agents, each working on different features in parallel. It sounds chaotic, but it's been a natural progression as we've learned to trust AI more and as the tools have dramatically improved.

Beyond the code: Shipping faster with AI with Leo P.

We’re running a short mini-series on The Debrief podcast called Beyond the code, where we interview our engineers about what it’s really like to build at incident.io. In this episode, we chat with Product Engineer Leo about how we’re using AI tools like Claude Code to ship more product, more quickly.

From dashboard soup to observability lasagna: Building better layers

Let's be honest - observability can suck. Ever feel like you're swimming in dashboard soup? You know the feeling: tons of single-use dashboards, building new ones during every incident only to lose them in the chaos, and spending ages creating visualizations that no one ever looks at again. Even with all the right tools, something still feels off.

Beyond the code: On-call, Claude, and cinnamon buns with Leo P.

We’re running a short mini-series on The Debrief podcast called Beyond the code, where we interview our engineers about what it’s really like to build at incident.io. In this episode, we chat with Product Engineer Leo about her time building On-call, our favorite engineering tooling, and what makes our engineering culture as good as cinnamon buns.

Beyond the code: Coffee, copilots, and building AI with Rory M.

We’re running a short mini-series on The Debrief podcast called Beyond the code, where we interview our engineers about what it’s really like to build at incident.io. In this episode, Norberto Lopes and Rory Malcolm discuss Rory's journey as a product engineer at incident.io, focusing on his experiences in the AI team and the challenges of developing the AI investigations product. They explore the engineering culture at incident.io and the impact of AI on incident management.

Why you're (probably) doing service catalogs wrong

Service catalogs promise a lot of things: powerful automations, insights into your technology estate. But over the last few years, many of us have learned that setting up and maintaining a service catalog is really hard. Building out a catalog from a standing start can take months, or even years. Too many people get stuck in a chicken-and-egg situation, where you can’t deliver value because you don’t have the data in your catalog, and you can’t convince anyone to spend time helping you because the catalog doesn’t do anything yet.

Why we vibe coded a marketing campaign for Anthropic

Let’s start with the obvious: we’d like to have Anthropic as a customer. We greatly admire the work they are doing at the intersection of frontier models and safety. We use lots of different AI tooling at incident.io, and we’re all-in on AI, both to improve the productivity of our internal team and, more importantly, to provide our customers with superpowers in the form of an AI incident responder.

The EU AI Act and what it means for managing incidents

If you've been in earshot of tech leadership lately, you've probably heard the words 'EU,' 'AI,' and 'compliance' in conversation. The EU AI Act is officially upon us, and with it comes a whole new set of incident response and reporting requirements that might feel like yet another bureaucratic burden to worry about. But there's a different way to look at this legislation.

Pager fatigue: Making the invisible work visible

As much as you try to prevent it, your product will break sometimes. While you hope it will have the decency to do so while you are awake and already working, sometimes the product is inconsiderate and decides to break outside your office hours. Being woken up by a page at 3 am sucks, and being woken up again two hours later (when you get pinged for a follow-up issue you missed the first time) sucks even more.