Latest News

Five ways Gremlin helps organizations meet DORA requirements

May 7, 2024 By Ryan Detwiller In Gremlin

Enacted by the European Union, the Digital Operational Resilience Act (DORA) establishes new standards for digital operational resilience in the financial sector. DORA changes the financial sector's approach to digital security and resilience by imposing stringent requirements for Information and Communication Technology (ICT) risk management, incident reporting, third-party risk management, and regular testing.


Three roles you need for reliability success

May 7, 2024 By Gavin Cahill In Gremlin

It’s one thing to say that reliability is a priority for your organization, and a whole other thing to make actual, demonstrable improvements in the availability of your applications. Sadly, it’s common for organizations to invest time, money, and effort into improving reliability only to barely nudge the needle on incidents and downtime. But there are hundreds of companies successfully improving their reliability posture—and doing it at enterprise scale.


How to build reliable services with unreliable dependencies

May 2, 2024 By Andre Newman In Gremlin

In an earlier blog, we looked at slow dependencies and how they can impact the reliability of other services. We explored what happens when dependencies are degraded, but what happens when they fail outright? What can you do when your application or service sends a request to another service, and nothing comes back? We’ll answer this question by using Gremlin to proactively test a service with multiple dependencies.


How to make your services resilient to slow dependencies

Apr 24, 2024 By Andre Newman In Gremlin

When discussing reliability, we tend to focus on the things that we have control over: applications, virtual machine instances, deployment patterns, etc. But this ignores a significant and ever-growing part of nearly all modern software: dependencies. Dependencies are services that provide extra functionality for other services and applications. For instance, many websites depend on databases, caches, payment processors, and similar services in order to function.


Hitting reliability goals in the face of layoffs

Apr 23, 2024 By Jeff Nickoloff In Gremlin

It’s never easy when layoffs hit your organization. In addition to the personal impact of losing friends and coworkers from your team, those who remain are left trying to achieve the same business goals with fewer people and resources. Unfortunately, layoffs and restructuring have become a common part of business. But you’re not alone. Your partners (including Gremlin) are here to help you navigate your new reality.


How to ensure your Kubernetes Pods and containers can restart automatically

Apr 16, 2024 By Andre Newman In Gremlin

As complex as Kubernetes is, much of it can be distilled to one simple question: how do we keep containers available for as long as possible? All of the various utilities, features, platform integrations, and observability tools surrounding Kubernetes tend to serve this one goal. Unfortunately, this also means there’s a lot of complexity and confusion surrounding this topic. After all, most people would agree that availability is important, but how exactly do you go about achieving it?


How to ensure your Kubernetes cluster can tolerate lost nodes

Apr 12, 2024 By Andre Newman In Gremlin

Redundancy is a core strength of Kubernetes. Whenever a component fails, such as a Pod or deployment, Kubernetes can usually automatically detect and replace it without any human intervention. This saves DevOps teams a ton of time and lets them focus on developing and deploying applications, rather than managing infrastructure.


Top 7 Kubernetes Chaos Engineering Tools

Apr 10, 2024 By Daniel Olaogun In Speedscale

Enhance Kubernetes resilience with AWS Fault Injection Simulator, LitmusChaos, Gremlin, Chaos Monkey, ChaosBlade, Azure Chaos Studio, and Speedscale.


How to standardize resiliency on Kubernetes

Apr 10, 2024 By Gavin Cahill In Gremlin

There’s more pressure than ever to deliver high-availability Kubernetes systems, but a combination of organizational and technological hurdles makes this easier said than done. Technologically, Kubernetes is complex and ephemeral, with deployments that span infrastructure, cluster, node, and pod layers. And as with any complex and ephemeral system, the large number of constantly changing parts opens the possibility of sudden, unexpected failures.


Where to automate resilience testing in your SDLC

Apr 9, 2024 By Ryan Detwiller In Gremlin

When organizations begin to deploy resilience testing or Chaos Engineering, there’s a natural question: can we integrate this with our CI/CD pipeline or release automation tools? After all, you’re likely running unit, performance, and integration tests already—is resiliency different? The short answer is yes—to both. Integration is possible, but resiliency is different, so automation is a nuanced conversation.

