The latest news and information on Site Reliability Engineering and related technologies.
Digital business is an imperative for 21st-century companies. Increasingly, organizations are directing investments toward technologies that deliver outcomes fast and enable more resilient digital business models. In this landscape, incidents such as software bugs, power outages, or downed networks have major consequences that affect both revenue and customer loyalty.
Site reliability engineering continues to gain traction in software development and IT. SRE is at the crossroads of software development and IT operations. In Ben Treynor’s words, SRE is “what happens when you ask a software engineer to design an operations function.” Site reliability engineering is a way for developers to actively build services and functions to improve the resilience of people, processes and technical systems.
“It’s too expensive!” “Do we really need another tool?” “Our APM works just fine.” With strapped tech budgets and an abundance of tooling, it can be hard to justify a new expense or something new for engineers to learn, especially when they feel their current tool does the job adequately. But does it?
At CircleCI, CI has a second meaning: Continuous Improvement. We continuously seek out feedback not only to improve our code but to improve our processes and get better at our jobs along the way. This Continuous Improvement starts with one important company value: a blameless culture. Our blameless culture extends into every part of how we operate.
For the past four years, Catchpoint and various partners have run a yearly SRE Survey. This year, Blameless is excited to partner with Catchpoint for the fifth annual survey. We want to hear from you if you are in a DevOps or SRE role, or if you work on reliability under some other title. There is tremendous value in listening closely to practitioners.