Latest Videos

Failover and cloud aren't enough for reliability

Aug 26, 2025 By Gremlin In Gremlin

Amin Momin of @CapgeminiGlobal talks about how reliability takes dedicated effort beyond just using the cloud and setting up failover. Full transcript: There are two misconceptions about reliability. One is people only think failover is reliability. Just doing the failover, that will be enough from the reliability point of view. That's the first one. And the second one: we are deployed into the cloud, so it is the service provider's responsibility to provide the reliability.

True reliability takes the whole team

Aug 22, 2025 By Gremlin In Gremlin

Reliability takes the whole team working together. Full transcript: If you really want to get good at measuring your reliability, then you have to work together as a team. Once your software engineer organization has decided, "We're gonna test these applications to make sure that they have redundancy, availability, resilience." Just stick to that framework that you come up with as a team.

Encourage the boring reliability work

Aug 20, 2025 By Gremlin In Gremlin

Proactive, regular reliability work is boring, repetitive, and EFFECTIVE. And if leadership wants the incredible results it brings, they have to encourage the right behavior.

Reliability upholds your promise to users

Aug 19, 2025 By Gremlin In Gremlin

Consistent systems are reliable systems, according to Ganesh Seetharaman, Managing Director at @Deloitte. Full transcript: Strong reliability is demonstrated when systems consistently work as expected even during peak demand or unexpected events. When issues do happen, they are resolved quickly and transparently so users experience minimal disruption. Reliability also means data integrity. No matter how much stress the system is under, information needs to be accurate and secure.

Reliability is when customers aren't impacted

Aug 14, 2025 By Gremlin In Gremlin

Ultimately, a system is reliable when customers and engineers can count on it. Full transcript: When I get to hear stories like, "Hey, we just had our holiday sales event kick off and everything went smoothly and I didn't have to wake up in the middle of the night." That is really the true definition of reliability: these people that are constantly hands-on keyboard, in charge of making sure that people like myself and like you aren't impacted when we're going to, for example, buy a new pair of sneakers, or we're going to get some sort of limited edition release that's coming out, right?

Reliability isn't an afterthought

Aug 12, 2025 By Gremlin In Gremlin

“Reliability must be a crucial outcome for all of the architectures.” —Anish Behanan from @CapgeminiGlobal.

Introducing Reliability Intelligence

Aug 11, 2025 By Gremlin In Gremlin

Reliability Intelligence draws on Gremlin expertise with every test to show you how the test failed and recommend remediation.

The riskiest thing you can do is not measure your risk

Aug 8, 2025 By Gremlin In Gremlin

Hiring good engineers is important, but it’s not enough to prevent outages. You need to measure and track your risk to get real results. Full transcript: My name's Jeff Nickoloff. I'm a principal engineer here at Gremlin. What I hear non-technical functions talk about is really they are much happier to sort of lean on their great engineers. "Oh, we've got a great engineering culture. We don't have reliability issues because we hire the best people."

Avoid the Chaos Engineering bottleneck

Aug 6, 2025 By Gremlin In Gremlin

Chaos Engineering is great, but by itself it can create bottlenecks that limit your reliability journey. Full transcript: One of the things we've learned while building Gremlin and being the first Chaos Engineering tool to market is with all the greatness that comes with this approach, we've learned some of the downfalls, some of the drawbacks. And one of those is how you scale this practice.

Reliability is the absence of uncertainty

Aug 5, 2025 By Gremlin In Gremlin

Are your teams truly ready when they ship code? Amin Momin of @CapgeminiGlobal talks about how true reliability is the absence of uncertainty. Full transcript: Reliability to me signifies the absence of uncertainty. Whenever we go to production, we don't want anything to be unknown.
