Editor’s note: Today we hear from Kenny Kon, an SRE Director at Sabre. Kenny shares about how they have been able to successfully adopt Google’s SRE framework by leveraging their partnership with Google Cloud. As a leader in the travel industry, Sabre Corporation is driving innovation in the global travel industry and developing solutions that help airlines, hotels, and travel agencies transform the traveler experience and satisfy the ever-evolving needs of its customers.
Prometheus, the de facto standard for Kubernetes monitoring, works well for many basic deployments, but managing Prometheus infrastructure can become challenging at scale. As Kubernetes deployments continue to play a bigger role in enterprise IT, scaling Prometheus for a large number of metrics across a global footprint has become a pressing need for many organizations.
The need for relevant and contextual telemetry data to support online services has grown in the last decade as businesses undergo digital transformation. These data are typically the difference between proactively remediating application performance issues or costly service downtime. Distributed tracing is a key capability for improving application performance and reliability, as noted in SRE best practices.