True reliability takes the whole team
Reliability takes the whole team working together. Find out how to be reliable with Gremlin → https://www.gremlin.com/
Full transcript:
If you really want to get good at measuring your reliability, then you have to work together as a team.
Once your software engineering organization has decided, "We're gonna test these applications to make sure that they have redundancy, availability, resilience," just stick to that framework that you come up with as a team.
Once that team is all on the same page, then reliability just becomes a part of the culture. It becomes an everyday habit. Your engineers are gonna be logging into your observability tool and checking on things regularly. You're gonna be checking Sentry or whatever alerting system you have set up. You're testing regularly to see, “How does my application respond to different faults or different spikes in resources or latency?”
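To make the idea of regular fault and latency testing concrete, here is a minimal sketch in Python. It is not Gremlin's API or any tool named in the talk. Every name in it (fetch_price, get_price, inject_fault, inject_latency, timed) is hypothetical and exists only for illustration. It assumes an application function that should fall back to a cached value when its dependency fails.

```python
import time


class InjectedFault(Exception):
    """Raised by the test harness to simulate a failing dependency."""


def fetch_price(item):
    # Hypothetical dependency; stands in for a real network call.
    return {"item": item, "price": 10}


def inject_fault(func):
    """Return a replacement for func that always fails with InjectedFault."""
    def faulty(*args, **kwargs):
        raise InjectedFault("simulated outage")
    return faulty


def inject_latency(func, delay_seconds):
    """Return a version of func that sleeps for delay_seconds before running."""
    def slow(*args, **kwargs):
        time.sleep(delay_seconds)
        return func(*args, **kwargs)
    return slow


def get_price(item, dependency, cached_price=None):
    """Application code under test: fall back to a cached price on failure."""
    try:
        return dependency(item)["price"]
    except Exception:
        return cached_price


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start
```

A team could run checks like these on a schedule: call get_price with inject_fault(fetch_price) and confirm the cached value comes back, or wrap the dependency with inject_latency and compare the time reported by timed against whatever latency budget the team has agreed on.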
Those are all things that have to be top of mind for a software engineer, and not just a software engineer, but the entire organization needs to be on board.