Blameless

How to Choose Monitoring Tools for DevOps and SRE

Jul 23, 2020 By Emily Arnott In Blameless

When developing for reliability or implementing resilient DevOps practices, data is at the heart of your decision-making. Without carefully monitoring key metrics like uptime, network load, and resource usage, you won’t know where to focus development effort or which operational practices to refine. Fortunately, a wide variety of monitoring tools are available to help you collect this data and gain visibility into it.

Leaders, Here's how to Encourage Full Service Ownership

Jul 22, 2020 By Hannah Culver In Blameless

Service ownership is becoming common practice, and its benefits are well known: happier customers, aligned teams, and fewer incidents. While this sounds great, it’s often easier said than done, because it requires a shift in culture and mindset. Leadership will need to encourage and empower teams to adopt the “you build it, you run it” mentality. Here are some ways leaders can help get teams on board.

SREview Issue #3 July 2020

Jul 21, 2020 By Blameless Community In Blameless

Here’s the July issue of SREview! This monthly zine features epic Tweets, content, and events happening in the SRE and resilience engineering community.

How SLOs Help Your Team with Service Ownership

Jul 21, 2020 By Hannah Culver In Blameless

Service ownership is becoming a best practice for teams looking to innovate while maintaining the level of reliability that customers expect. Service ownership means seeing the service through its entire lifecycle. In short, it means you build it, you run it. You’ll be responsible for the service’s security, reliability, performance, and quality. This doesn’t mean you won’t have help from SREs to optimize or automate toil.

Webinar: Modern Metrics to Understand Operational Health

Jul 21, 2020 By Blameless In Blameless

In this webinar, you'll learn which SRE metrics give you better insight into operational health. We walk through common challenges and pain points in understanding operational health, the metrics to measure at each stage of your maturity journey, and a live demo showing solutions in action.

The Essential List of Top SRE Resources

Jul 17, 2020 By Emily Arnott In Blameless

Are you looking to get up to speed on SRE fundamentals with the best SRE books and best DevOps books? Or are you hoping to expand your SRE knowledge into new domains? Either way, we’ve got you covered in our list of essential SRE resources!

5 Tips for Getting Alert Fatigue Under Control

Jul 16, 2020 By Hannah Culver In Blameless

What happens when you receive a notification that something is wrong with your system and you have no clue what it means, or why you’re receiving that alert? Maybe you have to parse through the alert conditions to suss out what the alert indicates, or maybe you need to ping a coworker and ask. Not knowing what to do with an alert also contributes to alert fatigue, because it increases the toil and time required to respond.

Leadership and Innovation with Instacart's VP of Infrastructure

Jul 15, 2020 By Blameless Community In Blameless

Blameless CEO Ashar Rizqi recently had the pleasure of interviewing Dustin Pearce in a virtual executive fireside chat and AMA. Dustin is an experienced leader in scaling hyper-growth, cloud-native companies. He is the VP of Infrastructure at Instacart and previously served as Head of Service Engineering at Slack.

Promoting Continuous Learning with SRE

Jul 14, 2020 By Hannah Culver In Blameless

With the extreme changes we’ve all been through these last several months, it should come as no surprise that our jobs have changed drastically, too. We’re working remotely. We’re dealing with increased resource constraints. Our services are receiving more traffic than usual, and we’re tasked with keeping things up and running. Our work-as-done may not match what we did at the beginning of 2020.

Teamwork and Culture in the Era of Remote Work

Jul 13, 2020 By Hannah Culver In Blameless

With decreased resources, increased stress and cognitive load, and social distancing policies, many teams are under extreme pressure. Without over-communication and special attention paid to organizational culture, teams can become fractured, anxious, or disillusioned.
