Operations | Monitoring | ITSM | DevOps | Cloud

August 2018

Simple/hard metrics that help reduce MTTR when looking for a root cause

Recently there was a mini-incident in a data center where we host our servers. It did not affect our service after all. And thanks to the right operational metrics, we’ve been able to instantly figure our what’s happening. But then an thought came up to me, how we would’ve been racking our heads trying to understand what’s happening without 2 simple metrics.