Improving Reliability with Error Budgets, Metrics, and Tracing in Stackdriver (Cloud Next '18)

Learn about the Site Reliability Engineering (SRE) principle of error budgets, along with best practices for SLO-focused alerting and targeted debugging. Members of the Stackdriver and Customer Reliability Engineering teams will demonstrate how Stackdriver tooling, inspired by the needs of SREs at Google, helps you run services more reliably and with fewer false-positive alerts by tracking and alerting on error budgets, and by debugging with the exemplar technique during an outage.