FireHydrant

3 Ways to Help CS and Engineering Work Better Together

Jan 15, 2020 By Anna Kelley In FireHydrant

As Engineering teams start spending more time and effort on incident response, they usually focus on improving the process within their own team. We think there are additional benefits that can come from a holistic approach to improving incident response across your organization. In this post, we will explore how you can enable Engineering and Customer Success teams to work together more effectively when an incident occurs.

Severity Matrix Updates

Nov 11, 2019 By Bobby Tables In FireHydrant

We’re on a mission to make responding to incidents a bit less chaotic. One of the best features we offer (we’re definitely not biased, no way) is a simple way to define how a severity gets determined when you open an incident. We call it the severity matrix, and today it has a new look. Previously, we had a preset list of conditions and impacts that allowed you to pick a severity that matched them.

Announcing Runbooks

Oct 17, 2019 By Bobby Tables In FireHydrant

Since the beginning, we’ve wanted to make it faster, easier, and even a joy to respond to incidents. We’ve had the typical components of incident response for a while, but orchestrating them was a manual task for our users. Today we’re bringing together all the features already available in our incident response tool in our newest release: Runbooks.

A single person on-call "rotation" is a critical vulnerability

Oct 9, 2019 By Daniel In FireHydrant

One of the most common complaints we hear from operations and site reliability engineers is about the quality-of-life impacts and the resulting stress imposed by their on-call responsibilities. Most of us are already aware that a proper on-call rotation is critical to our engineering organization’s health in terms of both immediate incident response and long-term sustainable growth.

Open Source can be a silver bullet, but your application might be a werewolf

Sep 22, 2019 By Bobby Tables In FireHydrant

I was reminiscing with an old co-worker about an incident that happened at a past job. You know the one, the one where you installed a library that makes some task of yours simple, only to discover that the library makes things worse. This incident in particular involved the way that images were served out of our Ruby on Rails application, and the library that made it possible to “easily resize before serving” them.

Announcing our AWS CloudTrail Integration

Sep 16, 2019 By Daniel In FireHydrant

One of the most common reasons for system failures is changes to the underlying infrastructure. AWS CloudTrail does a great job of recording when actions are taken, but a lot of organizations don’t take advantage of it. FireHydrant now includes this data, giving you visibility into changes to your infrastructure while you’re investigating an incident.

Dynamic Kubernetes Informers

Aug 28, 2019 By Bobby Tables In FireHydrant

In the past I’ve written about how to use informers in Kubernetes for particular resources, but what if you need to be able to receive events for any Kubernetes resource dynamically? Well, there’s a client-go package for that too. At FireHydrant, we recently updated our Kubernetes integration to watch changes for any resource you configure, and I wanted to write down how we made it at a high level.

Announcing our Statuspage.io integration

Aug 22, 2019 By Bobby Tables In FireHydrant

Ever go to a status page and it says everything is operational when it definitely isn’t? You refresh maddeningly thinking it might be you. You ponder if the bill for the internet has been paid. Then, as a last resort, you check Twitter only to discover hundreds of people are experiencing the same problem. This is common, and because of it, we’re happy to release our integration with Statuspage.io!

3 Defensive Programming Techniques for Rails

Jul 29, 2019 By Bobby Tables In FireHydrant

Incidents happen all the time because of bad code deploys. You write some code that passes code review, it is then automatically shipped to production after a test suite passes, and BAM, an outage happens. This fairly common occurrence can be prevented. Using some simple ideas we can defend ourselves from the hidden mistakes that code reviews and chaos engineering sometimes won’t catch.

Announcing Flare: Make opening incidents stress free

Jun 28, 2019 By Bobby Tables In FireHydrant

We’re launching a new feature today that allows anyone in your organization to kick off your incident response process from Slack, with an appropriate severity level attached. Often people are afraid to open an incident or even share that they’re aware of something going wrong with your applications. When everything is important, nothing is important; users frequently overestimate the impact of an incident and assign an inappropriately high severity level.

