Operations | Monitoring | ITSM | DevOps | Cloud

SRE

The latest News and Information on Service Reliability Engineering and related technologies.

Essential Tools for Site Reliability Engineers

Site reliability engineers (SREs) are involved in scaling systems and making them reliable and efficient for organizations. But SREs often fail to build system resiliency when they do not have the right tools at their disposal. In this post, we’ll uncover five leading tools that SREs can use to drive the reliability and stability of computing systems. It also examines how SREs can use the tools to improve operations tasks and infrastructure processes.

Situation Room: On-Call Team Faces Worst Case of Sunday Scaries

Picture this: it’s Sunday night. You’re relaxing in bed, in that sweet spot where you’re geared up for Monday, but the fun of the weekend hasn’t yet faded. As you idly scroll through content on your phone, you see a message preview pop up. It’s to your work email. That’s bad. It’s from the hosting company you contract. That’s really bad. They’re saying they accidentally deleted the production database. That’s “jump out of bed” bad.

Self-Compassion Instead of Self-Blame

The tech industry is competitive and not without challenges. People are always growing and improving by pushing their limits. Innovation comes in many forms. In order to foster a healthy culture while allowing people to flourish, organizations must carefully enact policies. Growth should be encouraged while discouraging competition and comparison. One of the core policies organizations implement to achieve these goals is blamelessness.

DevOps & SRE Words Matter: How Our Language has Evolved

As the tech world changes, language changes with it. New technologies will always introduce new terms and descriptions to provide clear understanding. For example, the emergence of the cloud introduced language to describe the changing relationship between servers and clients. Then, of course, product providers will also dictate how their products are to be described, i.e. describing services as “cloud-native”.