SRE

The latest news and information on Site Reliability Engineering and related technologies.

Eliminating Toil In SRE

Jun 27, 2022 By Mbaoma Mary In Reliably

Toil is a term coined by Google to describe the repetitive, tedious tasks associated with running a production service. Toil tends to be manual and devoid of any long-term value. Toil is not just ‘work I do not like to do’. Each time an engineer has to engage manually with a production system, that time is devoted to toil. These tasks multiply as your service grows. Site Reliability Engineers (SREs) should spend less time on toil.

Webinar Recap: How to Avoid Being On Call With Under-Instrumented Tools

Jun 24, 2022 By Jessica Kerr In Honeycomb

“It’s too expensive!” “Do we really need another tool?” “Our APM works just fine.” With strapped tech budgets and an abundance of tooling, it can be hard to justify a new expense—or something new for engineers to learn. Especially when they feel their current tool does the job adequately. But, does it?

Product Roundup: New Blameless Features in June 2022

Jun 22, 2022 By Phoebe Wang In Blameless

Summer means things are heating up. And things are definitely heating up at Blameless! We’ve been hard at work delivering new features and capabilities to our customers, so today I wanted to share a quick summary of all the latest. Here are 4 exciting product updates that enhance the way teams manage incidents and deliver reliable products to their customers.

Making Reliability A Critical Service Of An Organization

Jun 21, 2022 By Aimee Pearcy In Reliably

As systems become more complex, reliability is becoming an increasingly important requirement. Organizations are quickly realizing that making reliability a critical part of their service means that other organizations will be less likely to cut spending on them. As a result, the field of site reliability engineering (SRE) has grown rapidly over the past few years.

How to Establish Service Level Objectives In Software Engineering

Jun 19, 2022 By Samadrita Ghosh In Reliably

SLOs, or Service Level Objectives, are the foundation of Site Reliability Engineering (SRE). To understand SLOs correctly, the first step is to understand Service Level Indicators, or SLIs. SLIs are metrics that measure the vitals of the service. These vitals are chosen based on two conditions. First, they are the features that the user is primarily concerned about. Second, they allow the engineering team to get an overview of the system’s health.
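
As a rough illustration of how an SLI relates to an SLO, here is a minimal Python sketch. It assumes a request-based availability SLI (good requests divided by total requests) and a 99.9% target. The `SLO` class, the function names, and the numbers are all hypothetical and are not taken from the post.

```python
from dataclasses import dataclass


@dataclass
class SLO:
    """A target for one SLI (hypothetical structure, for illustration only)."""
    name: str
    target: float  # e.g. 0.999 means 99.9% of requests must be good


def availability_sli(good_requests: int, total_requests: int) -> float:
    """SLI: the fraction of requests that were served successfully."""
    if total_requests == 0:
        # Assumption: a window with no traffic is treated as fully available.
        return 1.0
    return good_requests / total_requests


def meets_slo(sli: float, slo: SLO) -> bool:
    """The SLO is met when the measured SLI is at or above the target."""
    return sli >= slo.target


if __name__ == "__main__":
    slo = SLO(name="checkout-availability", target=0.999)
    sli = availability_sli(good_requests=99_950, total_requests=100_000)
    print(f"{slo.name}: SLI={sli:.4f}, met={meets_slo(sli, slo)}")
```

The SLI here is the measurement and the SLO is the threshold applied to it. A real service would likely track several SLIs (latency, error rate, and so on) over a rolling window.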

Continuous Validation: What Is It And Why Is It Important?

Jun 18, 2022 By Catrin Haberfield In Reliably

By investing in a CI/CD pipeline, it’s entirely possible to automate a large part of the software development life cycle – letting businesses deliver high-quality, high-efficiency outputs with a faster time to market. But there are multiple elements to the CI/CD process, including the all-seeing eye that is continuous validation. So what exactly is continuous validation, and why should software developers bother to engage with it?

Continuous Documentation In A CI/CD World

Jun 18, 2022 By Aimee Pearcy In Reliably

Continuous documentation is the process of creating and maintaining code documentation incrementally throughout a project in a way that seamlessly incorporates it into the development workflow. It is a key part of improving reliability within an organization. It’s not just new features that need to be documented: anything useful, from bug fixes to how to get started using the code, should be documented. It should also be updated frequently to ensure that it stays relevant.

How To Build High-Performing Engineering Teams

Jun 18, 2022 By Charity Majors In Reliably

There is a distinct gap opening up between the top engineering teams and the rest. The elite teams represent the top few percent of engineering teams and are making incredible gains year on year in velocity, reliability, and human compatibility, whilst the bottom 50% are losing ground. The loss has nothing to do with engineering ability.

Demo Day | Discover Developer-First Reliability

Jun 16, 2022 By Reliably In Reliably

Join us for the very first edition of Demo Days. This month we’ll be meeting the team at Reliably, who will give a live product demo showcasing how their platform can help you operate with greater predictability and less anxiety. See Reliably in action and discover developer-first reliability as one of their experts guides us through the product and its features.

The value of blameless culture - from IC to C-Suite

Jun 16, 2022 By Tyler McGoffin In CircleCI

At CircleCI, CI has a second meaning: Continuous Improvement. We continuously seek out feedback not only to improve our code but to improve our processes and get better at our jobs along the way. This Continuous Improvement starts with one important company value: a blameless culture. Our blameless culture extends into every part of how we operate.
