Reliability should be about empowering teams to make more resilient software

Jun 11, 2024

Check out how a customer integrated standardized testing into their CI/CD pipeline with minimal lift from individual teams.

"We want to empower teams such that they can do it by themselves with minimal investment. So we have started working on the first step of it, which is the automated setup where a product or feature team just brings their environment and provides us the servers or containers, or wherever they've deployed their stack, in either JSON or YAML format. And we are able to go in and set up Gremlin agents on them. Another area was we would wait for subject matter experts to figure out what should be the resiliency plan, what should be the resiliency test.

So we want to build a recommendation engine where once you provide this is what your setup looks like, we could do things like say, 'Oh, I see you have a Mongo with a three-node cluster. From previous patterns, we have seen that there could be issues here. So we recommend that you do a black hole attack where you take away your primary and see what happens with your election.'"
— Kaushal Dalvi, Sr. Principal Engineer, UKG
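
The recommendation idea in the quote can be pictured as a small rule-based pass over a team's declared environment. The sketch below is only an illustration under stated assumptions, not UKG's implementation and not a Gremlin API: the inventory schema (`services`, `type`, `nodes`), the `Recommendation` class, the `recommend` function, and the experiment label `blackhole-primary` are all hypothetical names invented for this example. It reads a JSON inventory and, when it sees a MongoDB cluster with three or more nodes, suggests a black hole experiment against the primary.

```python
import json
from dataclasses import dataclass


@dataclass
class Recommendation:
    service: str      # name of the service the suggestion applies to
    experiment: str   # hypothetical label for the suggested experiment
    rationale: str    # why the rule fired


# Hypothetical inventory a team might hand over (schema is an assumption).
INVENTORY_JSON = """
{
  "services": [
    {"name": "orders-db", "type": "mongodb", "nodes": 3},
    {"name": "web", "type": "nginx", "nodes": 2}
  ]
}
"""


def recommend(inventory: dict) -> list:
    """Return experiment suggestions for the declared services."""
    recommendations = []
    for service in inventory.get("services", []):
        is_mongo = service.get("type") == "mongodb"
        node_count = service.get("nodes", 1)
        # Rule taken from the quote: a multi-node Mongo cluster is a
        # candidate for black-holing the primary to observe the election.
        if is_mongo and node_count >= 3:
            recommendations.append(
                Recommendation(
                    service=service["name"],
                    experiment="blackhole-primary",
                    rationale=(
                        f"{node_count}-node MongoDB cluster: isolate the "
                        "primary and watch how the election behaves"
                    ),
                )
            )
    return recommendations


recommendations = recommend(json.loads(INVENTORY_JSON))
for rec in recommendations:
    print(f"{rec.service}: {rec.experiment} ({rec.rationale})")
```

In a setup like this, each rule encodes a previously observed failure pattern, so a team would not have to wait for a subject matter expert before getting a first resiliency test plan. A YAML inventory could feed the same function once parsed into a dictionary.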