Once an incident is detected, taking the right actions automatically and immediately is the easiest way to achieve a sustainable, measurable improvement in your MTTR.
Intent-based Capacity Planning is Google's approach to declaring the reliability intent for a service and then dynamically solving for the most efficient resource allocation plan. Learn how you can start using this approach to manage the reliability of the services running on your Kubernetes cluster.
Mark Henderson has been a Site Reliability Engineer at Stack Overflow since 2015. Before this, he worked as the sole systems administrator at a small software company in Sydney, Australia. These days, he lives in South Australia with his wife and two children and works from home.
In our series Squad Talks, Fulsmita debriefs our team on what makes them tick at Squadcast. Madhu from our Engineering team shares his thoughts on what it's like to work at ground zero of a fast-growing tech startup. Here we go!
Squadcast is an intelligent incident management, monitoring, and alerting platform that improves your reliability by helping SRE and DevOps teams adopt IT incident management best practices such as intelligent alert routing, on-call rotations, collaboration, response automation, root cause analysis, and blameless postmortems.