What alerts should you have for serverless applications?
A key metric for measuring how well you handle system outages is the Mean Time To Recovery, or MTTR. It’s the average time it takes you to restore the system to working condition after a failure. The shorter the MTTR, the faster problems are resolved, the less impact your users experience, and hopefully the more likely they are to keep using your product! And the first step in resolving any problem is knowing that you have one.
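To make the metric concrete, here is a minimal sketch of how MTTR could be computed from a list of past incidents. The incident timestamps are made up for illustration, and separating out the detection time is my own framing to show why knowing about a problem quickly matters; neither comes from a real system.

```python
from datetime import datetime, timedelta

# Hypothetical incident records: (started, detected, resolved).
# These timestamps are invented purely for illustration.
incidents = [
    (datetime(2024, 1, 5, 10, 0), datetime(2024, 1, 5, 10, 20), datetime(2024, 1, 5, 11, 0)),
    (datetime(2024, 2, 9, 14, 0), datetime(2024, 2, 9, 14, 5), datetime(2024, 2, 9, 14, 30)),
]


def mean_duration(durations):
    """Average a non-empty list of timedelta values."""
    return sum(durations, timedelta()) / len(durations)


# MTTR: average time from the start of an outage until service is restored.
mttr = mean_duration([resolved - started for started, _, resolved in incidents])

# Average time spent before anyone knew there was a problem.
mean_time_to_detect = mean_duration(
    [detected - started for started, detected, _ in incidents]
)

print(mttr)                 # 0:45:00
print(mean_time_to_detect)  # 0:12:30
```

In this made-up data, more than a quarter of the recovery time passes before the problem is even noticed, which is the part that good alerts are meant to shrink.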