July 2018

Drilling down into Stackdriver Service Monitoring

If you’re responsible for application performance and availability, you know how hard it can be to see it through the eyes of your customers and end users. We think that will change with last week’s introduction of Stackdriver Service Monitoring, a new tool that monitors how your customers perceive your applications and then lets you drill down to the underlying infrastructure when there’s a problem.

Transparent SLIs: See Google Cloud the way your application experiences it

Like all good IT organizations, you religiously measure the performance and availability of your services and applications. But if those apps run in the cloud, critical components are often delivered by a third party or the cloud provider. In the case of a service disruption or degraded performance, how do you know what the problem is—your code, the network, or the provider? And, if the problem is with the service provider, how do you convince them to take action as quickly as possible?

Centralized Logging Solution for Google Cloud Platform (Cloud Next '18)

In this session, we’ll give practical guidance on consolidating and managing your logs, share tips on both what to log and what not to log, discuss logging agents and their potential pitfalls, and show you how to extract value from your log entries for reporting and alerting on logs.

Visualizing Network Topologies and Traffic (Cloud Next '18)

In this session, we will look at which use cases in the field of network monitoring and management are relevant in a cloud environment and which data Google Cloud Platform provides to gain insights. We will then demo how to visualize traffic flows and topologies using a mix of Google and Open Source tools.

Optimizing and Troubleshooting Your Application, the Google Way (Cloud Next '18)

In this session, you’ll learn about the value of these kinds of tools and how you can automatically extract telemetry from your app with OpenCensus, and you’ll see a demonstration of how to solve customer issues in a multi-cloud deployment with Stackdriver APM and other tools supported by OpenCensus.

Improving Reliability with Error Budgets, Metrics, and Tracing in Stackdriver (Cloud Next '18)

Members of the Stackdriver and Customer Reliability Engineering teams will demonstrate how Stackdriver tooling, inspired by the needs of SREs at Google, lets you run services more reliably and with fewer false-positive signals by tracking and alerting on error budgets and by debugging with the exemplar technique during an outage.
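
For readers new to error budgets, the arithmetic behind them can be sketched in a few lines. This is a minimal illustration under assumed numbers, not Stackdriver's implementation or API: the function names, the 99.9% target, and the request counts are all hypothetical.

```python
def error_budget_remaining(slo_target, total_requests, failed_requests):
    """Return the fraction of the error budget still unspent (can go negative).

    slo_target is the desired success ratio, e.g. 0.999 for "three nines".
    All names and values here are illustrative assumptions.
    """
    # The budget is the number of failures the SLO tolerates in the window.
    allowed_failures = (1.0 - slo_target) * total_requests
    if allowed_failures <= 0:
        # A 100% target (or an empty window) leaves no budget to spend.
        return 0.0
    return 1.0 - failed_requests / allowed_failures


def burn_rate(slo_target, total_requests, failed_requests):
    """Return how fast the budget is being consumed.

    1.0 means failures arrive at exactly the rate the SLO allows;
    values above 1.0 mean the budget will run out before the window ends.
    """
    if total_requests == 0:
        return 0.0
    observed_error_rate = failed_requests / total_requests
    return observed_error_rate / (1.0 - slo_target)


# Hypothetical window: 1,000,000 requests, 250 failures, 99.9% target.
remaining = error_budget_remaining(0.999, 1_000_000, 250)
rate = burn_rate(0.999, 1_000_000, 250)
print(f"budget remaining: {remaining:.0%}, burn rate: {rate:.2f}")
```

Alerting on the burn rate rather than on individual errors is one common way to cut false positives: a brief spike that barely dents the budget does not need to page anyone.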