Observability of Distributed Systems (class SRE implements DevOps)

Google Operations

Sep 18, 2018

In their previous video, Liz and Seth reduced actionable alerts by focusing on Service Level Objectives (SLOs), but how can we make our systems observable, instead of only being able to debug what we've thought to monitor in the past?

In this video, you'll learn how structured logs, metrics, and traces help SRE and DevOps practitioners find out where their systems are broken. We'll use metrics to find slow or erroring queries, traces to follow interactions between components, and logs to understand the errors in more detail.
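To make the three signals concrete, here is a minimal sketch in plain Python of how one request might emit all of them with a shared trace ID. It is an illustration under stated assumptions, not the Stackdriver API: the in-memory lists `METRICS`, `SPANS`, and `LOGS`, the helpers `log_event` and `span`, and the `handle_request` function are all hypothetical names standing in for a real exporter or client library.

```python
import json
import time
import uuid
from contextlib import contextmanager

# Hypothetical in-memory sinks; a real system would export to a backend.
METRICS = []  # tuples of (metric name, latency in ms, labels)
SPANS = []    # one dict per finished span
LOGS = []     # one JSON string per structured log entry


def log_event(severity, message, **fields):
    """Emit a structured log entry as a single JSON line."""
    line = json.dumps({"severity": severity, "message": message, **fields},
                      sort_keys=True)
    LOGS.append(line)
    return line


@contextmanager
def span(name, trace_id, parent_id=None):
    """Time one operation, then record a span and a latency metric for it."""
    span_id = uuid.uuid4().hex[:16]
    start = time.perf_counter()
    status = "ok"
    try:
        yield span_id
    except Exception as exc:
        status = "error"
        # The log carries the trace and span IDs so it can be joined to the trace.
        log_event("ERROR", str(exc), trace_id=trace_id, span_id=span_id,
                  operation=name)
        raise
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        SPANS.append({"name": name, "trace_id": trace_id, "span_id": span_id,
                      "parent_id": parent_id, "status": status,
                      "duration_ms": elapsed_ms})
        METRICS.append((name + ".latency_ms", elapsed_ms, {"status": status}))


def handle_request(fail=False):
    """Simulate a frontend request that calls a database (both hypothetical)."""
    trace_id = uuid.uuid4().hex
    try:
        with span("frontend", trace_id) as root_id:
            with span("database.query", trace_id, parent_id=root_id):
                if fail:
                    raise RuntimeError("query timed out")
    except RuntimeError:
        pass  # the error has already been logged by the spans
    return trace_id


if __name__ == "__main__":
    handle_request(fail=True)
    for entry in LOGS:
        print(entry)
```

In this sketch, the metric labels show which operation is erroring, the span's `parent_id` shows which component called which, and the log entries with the same `trace_id` supply the error detail.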

To get started with this functionality, Google Cloud offers Stackdriver Service Monitoring.

