The latest News and Information on Service Reliability Engineering and related technologies.
As a Customer Advocate, I talk to a lot of prospective Honeycomb users who want to understand how observability fits into their existing Site Reliability Engineering (SRE) practice. While I have enough of a familiarity with the discipline to get myself into trouble, I wanted to learn more about what SREs do in their day-to-day work so that I’d be better able to help them determine if Honeycomb is a good fit for their needs.
When planning the SRE from Home virtual event last month, one of the central themes was wellness and the need for self-care for SREs, especially during this time of crisis. Knowing how stressful an SRE’s day can be, and combining that with a global pandemic and new working conditions, we knew we needed programming around SRE and IT wellness for SRE from Home. We’re all looking for ways to maintain a healthy work-life balance, but hearing this from your peers was especially important.
With SRECon not happening this year and the world turned upside down by COVID-19, we set out to hold a virtual event to bring SREs together to share their experiences of what has changed. Last week’s SRE from Home was exactly that. With 1900 registrants, 20 lively Slack channels, six illuminating and entertaining talks from a diverse range of experts in the field, and our #askanSRE panel answering attendees’ questions with candid generosity, it was an amazing, jam-packed day.
Site Reliability Engineering (SRE) is a practice for managing the reliability of systems that began at Google in the early 2000s. Ben Treynor Sloss from Google started the first SRE team and coined the name.
We recently released Catchpoint’s SRE Report 2020, which analyzes results from the SRE survey we conducted early this year along with a recent addendum survey. The report offers a detailed look at the current state of SRE and how the shift to an all-remote work environment has impacted SRE teams. In this blog, we take a deeper look at one of the report highlights – ‘Heavy Ops Workload Comes at a Cost’.
“Welcome to Tomorrowland.” That’s how Moogsoft Chairman and CEO Phil Tee kicked off the launch event of Moogsoft Express, the next-generation AIOps and observability solution built from the ground up for DevOps and SRE teams. The reference to a better future is fitting. With its arrival, Moogsoft Express helps these teams maintain visibility and control over increasingly complex CI/CD pipelines, so they can detect issues earlier, fix them faster and prevent outages.
Our 2020 SRE Report is ready! We launched the SRE survey 2020 this January with the goal of understanding the current state of SRE. The survey covered a range of topics. As we neared the end of the survey period, the SRE community was in the midst of a sudden change: SRE teams were forced to migrate to all-remote IT. We realized we would not be able to provide an accurate analysis without considering this shift in how SRE teams were operating.