Latest Posts

Podcast: Break Things on Purpose | Ep. 11: Ryan Kitchens, Senior Site Reliability Engineer at Netflix

Get started with Gremlin's Chaos Engineering tools to safely, securely, and simply inject failure into your systems to find weaknesses before they cause customer-facing issues. We’re excited to kick off Season 2 of Break Things on Purpose next month. In anticipation of our next season, here’s a bonus show from our archives! Subscribe to Break Things on Purpose wherever you get your podcasts. Find us on Twitter at @BTOPpod or shoot us a note at podcast@gremlin.com!

How to make an ROI calculator and impress finance (an engineer's guide to ROI)

Think back to the last time you wanted to purchase software for your organization. The software solves real problems and makes your team’s life easier. Then, finance delays or rejects your proposal. What’s going on?

Ensuring a smooth Kubernetes Dockershim Deprecation with Chaos Engineering

Trying to improve the reliability of your Kubernetes deployment? Start with these 5 chaos experiments. Kubernetes 1.20 is scheduled to be released next week, and this version contains a number of amazing enhancements including graceful node shutdown, more visibility into resource requests, and snapshotting volumes. But the change generating the most buzz is the deprecation of Docker as a container runtime.

Embracing virtual connections at AWS re:Invent 2020

This year has seen a complete re-imagining of tech conferences. Some were cancelled or postponed, while others have evolved and embraced the opportunity to go virtual. This meant innovating to bring the in-person event experience online.

Secure Chaos Engineering on Kubernetes Clusters Without being a Noisy Neighbor

Kubernetes is a powerful open source platform to build scalable, reliable systems, designed to be extensible and customizable for many use cases. Kubernetes provides the ability to scale individual pods, swap out runtimes, and control access to objects using namespaces.

Why modern testing requires Chaos Engineering

Modern applications are changing, and traditional testing practices are no longer up to the task. Learn more about the changing landscape of QA and how Chaos Engineering provides the necessary framework for testing modern applications. Chaos and Reliability Engineering techniques are quickly gaining traction as essential disciplines for building reliable applications. Many organizations have embraced Chaos Engineering over the last few years.

Knowing your systems and how they can fail: Twilio and AWS talk at Chaos Conf 2020

This year’s Chaos Conf was packed full of incredible talks from some of the industry’s foremost experts on Chaos Engineering.

Grubhub and JPMC Shift Reliability Testing Left at Chaos Conf 2020

Gremlin’s Chaos Conf is always an exciting event, bringing together leaders at the forefront of Chaos Engineering practices. This year was no exception, moving beyond defining Chaos Engineering to more advanced adoption and best practices discussions.

Looking back on Chaos Conf 2020

It’s already been a week since we closed our third annual Chaos Conf! While we were forced to take the conference online, this meant that more of you could join us. Over 3,500 people signed up to help make this the world’s largest Chaos Engineering conference. That’s 5x more than 2019, and nearly 10x more than 2018! This is a testament to the growth of Chaos Engineering as a practice across many different industries and around the world.

Is your microservice a distributed monolith?

Your team has decided to migrate your monolithic application to a microservices architecture. You’ve modularized your business logic, containerized your codebase, allowed your developers to do polyglot programming, replaced function calls with API calls, built a Kubernetes environment, and fine-tuned your deployment strategy. But soon after hitting deploy, you start noticing problems.