
October 2024

Three serverless reliability risks you can solve today using Failure Flags

Serverless platforms make it incredibly easy to deploy applications. You can take raw code, push it up to a service like AWS Lambda, and have a running application in just a few seconds. The serverless platform provider assumes responsibility for hosting and operating the platform, freeing you up to focus on your application. Naturally, this raises a question: if something goes wrong, who’s responsible?

Office Hours: How to test serverless applications using Failure Flags

Part of the Gremlin Office Hours series: A monthly deep dive with Gremlin experts. Serverless platforms are ideal for deploying scalable applications without having to manage infrastructure. However, giving up control of the infrastructure also makes it harder to test their reliability. It’s easy to simulate a network outage or latency when you have direct access to the host that your software runs on. What do you do when you only have control over the code?
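
When the code is the only thing you control, one option is to place the fault injection point inside the code itself. The sketch below illustrates that general idea only; it is not Gremlin's Failure Flags SDK. The `failure_flag` helper, the `FAILURE_FLAG_<NAME>` environment variable, and the `latency:<ms>` and `error` values are all hypothetical names invented for this example.

```python
import os
import time


def failure_flag(name, env=os.environ, sleep=time.sleep):
    """Hypothetical injection point: does nothing unless a fault is configured.

    Reads FAILURE_FLAG_<NAME> (an assumed convention, not a real SDK setting):
      "latency:<ms>" -> pause for that many milliseconds
      "error"        -> raise an exception
    Returns True if a fault was injected, False otherwise.
    """
    mode = env.get(f"FAILURE_FLAG_{name.upper()}", "")
    if mode.startswith("latency:"):
        delay_ms = int(mode.split(":", 1)[1])
        sleep(delay_ms / 1000)
        return True
    if mode == "error":
        raise RuntimeError(f"injected failure at flag '{name}'")
    return False


def handler(event, context):
    """Lambda-style handler with one injection point before its real work."""
    failure_flag("fetch_order")
    return {"statusCode": 200, "body": "ok"}
```

With no variable set, the flag is a no-op and the handler behaves normally. Setting `FAILURE_FLAG_FETCH_ORDER=latency:500` in the function's environment would add half a second of delay at that point in the code, without any access to the underlying host.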

Making serverless applications reliable and bug-free

Building applications using serverless technology on AWS, such as AWS Lambda and Amazon API Gateway, can be incredibly powerful. You get to scale effortlessly and focus on writing code without worrying about managing servers. But as your application grows and spreads across hundreds or even thousands of cloud resources, tracking errors and fixing issues quickly becomes a major challenge.

Reduce your AWS Step Functions' error remediation time by redriving executions directly from Datadog

AWS enables customers to retry or redrive failed Step Functions executions of Standard Workflows, continuing them from their points of failure while preserving all inputs. For example, if you find broken downstream logic in your code or hit unexpected errors during execution, you can remediate them either by fully re-running the execution or by using redrive to continue it from where it failed.
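
The choice between the two remediation paths can be sketched in Python. This is a minimal illustration, not the Datadog integration the article describes: `remediate_execution` is a hypothetical helper, and it assumes `sfn_client` is a boto3 Step Functions client (for example `boto3.client("stepfunctions")`) whose `describe_execution` response includes `redriveStatus`, `stateMachineArn`, and `input`.

```python
def remediate_execution(sfn_client, execution_arn, rerun_from_start=False):
    """Redrive a failed execution, or fully re-run it with the same input.

    Hypothetical helper; sfn_client is assumed to be a boto3
    Step Functions client. Returns "redriven" or "rerun".
    """
    execution = sfn_client.describe_execution(executionArn=execution_arn)

    # Redrive continues the same execution from its point of failure.
    if not rerun_from_start and execution.get("redriveStatus") == "REDRIVABLE":
        sfn_client.redrive_execution(executionArn=execution_arn)
        return "redriven"

    # Otherwise start a fresh execution from the beginning, reusing the input.
    sfn_client.start_execution(
        stateMachineArn=execution["stateMachineArn"],
        input=execution.get("input", "{}"),
    )
    return "rerun"
```

Redrive is the cheaper path when the steps before the failure already succeeded. A full re-run is the fallback when the execution is not redrivable or when the fix affects earlier steps.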