FireHydrant

The why and how behind running incident response game days

In any high-pressure situation, the key to fast action is preparedness, and that’s true of incidents, too. Documenting your incident response processes and training your team on them is essential to a coordinated, efficient response. Training sessions, sometimes called game days, are one way to get everyone up to speed.

Create a service catalog that grows with you

When your incident response process is centered around a service catalog, responders can more quickly pinpoint the service or functionality that’s down, bring in the right team or experts, and get to solving the problem faster. Saving even a few minutes can meaningfully reduce the cost of incidents and outages, so having up-to-date service details at your fingertips can make all the difference.

How FireHydrant handled the SVB banking crisis

On Thursday, March 9, 2023, something was afoot at our primary bank, SVB. By Friday, March 10, 2023, messages from our investors helped us quickly understand that FireHydrant needed to maneuver through a complex, unfolding incident. Operational incidents are incidents like any other.

Automatically Create Incidents from Alerts with Alert Routing

Shouldn’t your alerts be doing more of the work for you? A noisy channel with every alert from hundreds of monitors and microservices is a chaotic place to find the incidents that are actually impacting your customers. And it still requires a heck of a lot of human intervention. We think it’s time for something better. Today we’re releasing Alert Routing: the next phase of worry-free automation from FireHydrant.

How to define roles for your incident response team

Agility matters in incident response, and the easiest way to spring into action is by having a well-defined team in place ahead of time. The right people in the right roles will help you respond to and resolve incidents more quickly and efficiently. In fact, we found in the Incident Benchmark Report that incidents with roles assigned had a 42% lower mean time to resolution than those that didn’t. But what roles do you need to fill?