Manhattan, NY, USA
Nov 11, 2019 | By Bobby Tables
We’re on a mission to make responding to incidents a bit less chaotic. One of the best features we offer (we’re definitely not biased, no way) is a simple way to define how a severity is determined when you open an incident. We call it the severity matrix, and today it has a new look. Previously, we had a preset list of conditions and impacts that allowed you to pick a severity that matched them.
Oct 17, 2019 | By Bobby Tables
Since the beginning, we’ve wanted to make it faster, easier, and even a joy to respond to incidents. We’ve had the typical components of incident response for a while, but orchestrating them was a manual task for our users. Today we’re bringing together all the features already available in our incident response tool in our newest release: Runbooks.
Oct 9, 2019 | By Daniel
One of the most common complaints we hear from operations and site reliability engineers is about the quality-of-life impacts of their on-call responsibilities and the stress that results. Most of us are already aware that a proper on-call rotation is critical to our engineering organization’s health, in terms of both immediate incident response and long-term sustainable growth.
Sep 22, 2019 | By Bobby Tables
I was reminiscing with an old co-worker about an incident that happened at a past job. You know the one: you install a library that makes some task of yours simple, only to discover that the library makes things worse. This incident in particular involved the way images were served out of our Ruby on Rails application, and the library that made it possible to “easily resize before serving” them.
Sep 16, 2019 | By Daniel
One of the most common reasons for system failures is a change to the underlying infrastructure. AWS CloudTrail does a great job of recording when actions are taken, but a lot of organizations don’t take advantage of it. FireHydrant now includes this data, giving you visibility into changes to your infrastructure while you’re investigating an incident.