The riskiest thing you can do is not measure your risk

Aug 8, 2025

Hiring good engineers is important, but it’s not enough to prevent outages. You need to measure and track your risk to get real results. Find out how to be reliable with Gremlin → https://www.gremlin.com/

Full transcript:

My name's Jeff Nickoloff. I'm a principal engineer here at Gremlin.

What I hear from non-technical functions is that they're much happier to sort of lean on their great engineers: "Oh, we've got a great engineering culture. We don't have reliability issues because we hire the best people."

And while I'm sure that they hire very good people, there's this tendency to lean on this elasticity of skill and effort and to say, "Well, we're just not gonna have any risks. I mean, we're not gonna measure them, but we're not gonna realize any risks because our people are just so good that those won't happen." 

I'm a big accountability person. If you want something to change about your systems or your business or your relationships with your customer, you have to put in place systems of accountability to actually reinforce that change.

This means measuring your reliability, being proactive about it, not reactive.

You have to say, "Hey, how reliable are we today?" And you're not gonna get exact numbers, but it's a risk calculation, right? So you have to proactively measure the risk.

When people and teams put in the time and effort to be very intentional about understanding what risks they have and what resources they have available to address those risks, just the fact that they're thinking about it, dialoguing about it, gives them a huge leg up when it comes to actual operational performance.

There's nothing riskier that you can do than not measure your risk.