3 things you can do to get closer to five nines

Oct 2, 2025 By Andre Newman In Gremlin

Five minutes a year. That’s roughly how much downtime some of the world’s largest enterprises will tolerate. For most organizations, five nines (99.999%) of availability sounds like a pipe dream. But the trick to increasing availability isn’t massive infrastructure spending or complex system redesigns. All it takes is three key practices that any team can adopt. In this post, we’ll present these practices and show how we implement them at Gremlin.
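
To see where the five-minute figure comes from, here is a minimal sketch of the arithmetic behind an availability target. It assumes a 365-day year, and the function name `downtime_budget_minutes` is ours for illustration, not something from the post.

```python
# Assumption: a non-leap year of 365 days.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_budget_minutes(availability_percent: float) -> float:
    """Return the yearly downtime allowance, in minutes, for an availability target."""
    unavailable_fraction = 1 - availability_percent / 100
    return MINUTES_PER_YEAR * unavailable_fraction


if __name__ == "__main__":
    # Compare three nines, four nines, and five nines.
    for target in (99.9, 99.99, 99.999):
        minutes = downtime_budget_minutes(target)
        print(f"{target}% -> {minutes:.2f} minutes per year")
```

At 99.999%, the budget works out to roughly 5.26 minutes per year, which is where the headline number comes from. Each extra nine cuts the allowance by a factor of ten.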

You have to talk about reliability failures to fix them

Oct 2, 2025 By Gremlin In Gremlin

Gremlin Founder & CEO Kolton Andrus talks about how you can’t improve your reliability unless you’re willing to talk about it.

How to improve reliability? Test and test often

Sep 30, 2025 By Gremlin In Gremlin

The best way to improve reliability? According to Nick Mason with Gremlin, it’s to test and test often.

Security vs. ops: the two sides of reliability

Sep 25, 2025 By Gremlin In Gremlin

Security and ops work together to keep your systems reliable, but why do we treat them so differently? Reliability results start when you proactively take charge of your infrastructure and application risks. Transcript: When we talk about reliability in the software space and the digital operations space, you really end up falling into these two different mindsets.

Resilience vs. Reliability: What's the difference?

Sep 24, 2025 By Gremlin In Gremlin

We hear a lot about resilience and reliability, but what’s the difference? Gremlin Founder & CEO Kolton Andrus gives you a quick, easy comparison.

Reliability is different for every business

Sep 23, 2025 By Gremlin In Gremlin

Anish Behanan from Capgemini talks about how reliability is different for every business, and you have to know what it means for your organization.

Reliability means smooth on-call and a strong team

Sep 19, 2025 By Gremlin In Gremlin

True reliability is when your engineers have confidence in their systems and their teams. Full transcript: Reliability to me means my on-call shift is gonna be smooth because everybody is making the attempts to be smart about the type of code that we're writing. And we're regularly testing to make sure that our system has redundancy and can withstand latency spikes, it can withstand resource spikes.

How we keep Gremlin at five 9s

Sep 17, 2025 By Gremlin In Gremlin

Kolton Andrus, Founder and CEO of Gremlin, walks you through how we keep Gremlin at five 9s availability.

How to use Gremlin to test business processes

Sep 16, 2025 By Gremlin In Gremlin

Amin Momin of Capgemini gives an example of injecting faults to improve reliability with a capital market exchange.

How to make Netflix reliable: Address low-hanging fruit

Sep 11, 2025 By Gremlin In Gremlin

Reliability doesn’t have to be fancy and dramatic. Kolton and his team dramatically improved Netflix reliability by focusing on low-hanging fruit. Full transcript: My first holiday peak at Netflix, where my VP of engineering came to me and he said, "Kolton, what do you think the chance we make it through the holiday peak without an outage is?" I thought about it for a minute and I said, "50/50."
