December 2021

SRE Predictions 2022 | Blameless SRE

Dec 28, 2021 By Emily Arnott In Blameless

As the new year approaches, we at Blameless like to ponder the future of Reliability Engineering. For 2021, we predicted that adoption of site reliability engineering (SRE) would continue to grow, that adoption would increase faster among smaller organizations, and that SRE practices would get more attention than hiring as a way to drive adoption. We’re sure you’ll agree that these trends have indeed strengthened in the last year.

Incident Management Software | The Best Tools for Your Team

Dec 21, 2021 By Noor-ul-Anam Ruqayya In Blameless

Wondering about Incident Management Software? We explain the best incident management software tools and how they work.

SREs, Observability, and Automation

Dec 20, 2021 By Jason Bloomberg In Moogsoft

Automation takes repetitive tasks off professionals’ plates, freeing up time to focus on more valuable activities. Moogsoft’s API-driven automation capabilities enable SREs to make better use of their time, leading to better results for the business.

What is the Purpose of Observability? In a Word, Innovation

Dec 20, 2021 By Chris Tozzi In Broadcom

Asking an IT engineer or SRE to define the purpose of observability is kind of like asking someone to explain the purpose of life: There are lots of different opinions out there, and no way of proving any of them right or wrong. You could argue that observability is just a buzzword that refers to what used to be called monitoring.

A Site Reliability Engineer's Guide to the Holiday Season

Dec 17, 2021 By JJ Tang In Rootly

SREs face special challenges during the holidays. Here’s how to manage them.

Beyond Monitoring and IT Ops: Understanding How Observability Helps the SRE

Dec 17, 2021 By Charles Araujo In Moogsoft

An explanation of observability that highlights the role observability data play in supporting the active role of SREs as they reduce toil, improve uptime, and judiciously consume the error budget.

CloudOps (Everything You Should Know)

Dec 16, 2021 By Myra Nizami In Blameless

Wondering about CloudOps? We explain what CloudOps is, how it relates to DevOps, and how teams can use CloudOps to best manage cloud-native development.

Anomaly Detection

Dec 15, 2021 By Vince Power In Broadcom

IT Operations has a wide spectrum of roles and responsibilities. The positions range from level 1 (L1) operators to Site Reliability Engineers (SREs) and everything in between. L1 operators, for example, are (often) almost exclusively reactive. They feed off the constant stream of incidents reported by clients and events that are reported by monitoring and alerting systems. This is in contrast to SREs, who work at the other end of the spectrum.

How Disaster Ready are Your Backup Systems, Really?

Dec 14, 2021 By Emily Arnott In Blameless

In SRE, we believe that some failure is inevitable. Complex systems receiving updates will eventually experience incidents that you can’t anticipate. What you can do is be ready to mitigate the damage of these incidents as much as possible. One facet of disaster readiness is incident response - setting up procedures to solve the incident and restore service as quickly as possible. Another strategy involves reducing the chances for failure with tactics like reducing single points of failure.

Practical Guide to SRE: Infrastructure-as-Code (IaC)

Dec 10, 2021 By Quentin Rousseau In Rootly

An overview of how SREs can benefit from Infrastructure-as-Code.

7 DevOps Principles Every Team Needs to Practice

Dec 9, 2021 By Myra Nizami In Blameless

If you are optimizing your current DevOps processes, we can help. We’ll explain the 7 key principles of DevOps and how to put them into practice.

SRE Incident Management: Overview, Techniques, and Tools

Dec 8, 2021 By Jacob Hall In Dotcom-Monitor

In the world of a site reliability engineer (SRE), failure is not only an option, but also expected. Systems, web applications, servers, devices, etc., are all prone to performance issues and unexpected outages at some point. It is an unavoidable fact. These unexpected failures can lead to huge revenue losses, lost customer trust, and, depending on the industry, possibly fines. Fortunately, SRE incident management is one of the core practices used to limit the disruption caused by unexpected issues.

Who Needs Site Reliability Engineers (SREs)?

Dec 3, 2021 By JJ Tang In Rootly

Although every company can benefit from SREs, some need SREs more than others.

What can SREs do to make holiday season's peak traffic less chaotic?

Dec 3, 2021 By Vardhan NS In Squadcast

The holiday season's peak traffic is the most challenging period for SREs and on-call engineers. In this blog, we highlight what SREs can do to make the holiday season less chaotic. The recently concluded Black Friday weekend may have been the most challenging shift for on-call engineers working in the retail or e-commerce sector. Because such peak-traffic events push systems to their limits, engineering teams are under a lot of pressure while preparing for them.

DevOps Workflow | A Complete Guide & Best Practices

Dec 2, 2021 By Myra Nizami In Blameless

Curious about DevOps Workflow? We explain the DevOps process, how automation relates to workflow, and best practices for workflow design. DevOps is a methodology that involves Development and Operations working together during the development process. Workflow is the sequence in which tasks occur. DevOps workflow relies heavily on automation. Using DevOps, teams can increase collaboration and make their processes more stable and manageable.

Site Reliability Engineering, Observability, and the Tradeoffs of Modern Software

Dec 2, 2021 By Jason Bloomberg In Moogsoft

This blog post defines SRE by explaining SLOs and error budgets, highlighting the innovation vs. reliability tradeoff.
