Observability of Distributed Systems (class SRE implements DevOps)

In their previous video, Liz and Seth cut alerting down to what is actionable by focusing on Service Level Objectives (SLOs). But how can we make our systems observable, instead of only being able to debug what we thought to monitor ahead of time?

In this video, you'll learn how structured logs, metrics, and traces help SRE and DevOps practitioners find out where their systems are broken. We'll use metrics to find slow or erroring queries, traces to see how components interact, and logs to understand the errors in more detail.
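
As a rough illustration of how the three signals fit together, here is a minimal Python sketch using only the standard library. It is not taken from the video and does not use any Stackdriver API; the field names (trace_id, component, duration_ms), the helper functions, and the 500 ms threshold are all hypothetical choices made for this example.

```python
import json

# Hypothetical structured log entry: one JSON object per line, carrying a
# trace ID (to link components) and a duration (usable as a latency metric).
def make_log_record(trace_id, component, query, duration_ms, error=None):
    return json.dumps({
        "trace_id": trace_id,
        "component": component,
        "query": query,
        "duration_ms": duration_ms,
        "error": error,
    })

# Metrics-style question: which queries were slow or returned an error?
def slow_or_failed(log_lines, threshold_ms=500):
    records = [json.loads(line) for line in log_lines]
    return [
        r for r in records
        if r["error"] is not None or r["duration_ms"] > threshold_ms
    ]

# Trace-style question: which components took part in one request, in order?
def components_in_trace(log_lines, trace_id):
    records = [json.loads(line) for line in log_lines]
    return [r["component"] for r in records if r["trace_id"] == trace_id]

lines = [
    make_log_record("trace-a", "frontend", "GET /cart", 120),
    make_log_record("trace-a", "db", "SELECT cart", 900),
    make_log_record("trace-b", "frontend", "GET /pay", 80, error="timeout"),
]

suspects = slow_or_failed(lines)
path = components_in_trace(lines, "trace-a")
```

Here suspects holds the slow database query and the failed payment request, path lists the components that handled trace-a, and the full records supply the log detail for each suspect.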

To get started with this functionality on Google Cloud, try Stackdriver Service Monitoring.