Operations | Monitoring | ITSM | DevOps | Cloud

How to keep track of what's running in your Gremlin team

•Part of the Gremlin Office Hours series: A monthly deep dive with Gremlin experts. Reliability testing is ongoing, and tracking that work can be difficult in large organizations. According to our own product metrics, teams run an average of 200 to 500 tests each day! With so much happening, it’s hard to keep track of everything going on—unless you use Gremlin.

How a major retailer tested critical serverless systems with Failure Flags

Not too long ago, a customer came to us with a high-value use case. The customer, a major apparel company with retail and e-commerce applications, needed to prove that a critical service of their payment applications could failover correctly between regions in case of an outage. But there was one snag: the service was built using AWS Lambda. This meant infrastructure-focused tests would have trouble replicating the failure conditions necessary to test the failover due to Lambda’s serverless model.

Simulating artificial intelligence service outages with Gremlin

The AI (artificial intelligence) landscape wouldn’t be where it is today without AI-as-a-service (AIaaS) providers like OpenAI, AWS, and Google Cloud. These companies have made running AI models as easy as clicking a button. As a result, more applications have been able to use AI services for data analysis, content generation, media production, and much more.