
Scaling Runtime Diagnosis System w/ Grafana Pyroscope | Roblox at ObservabilityCON on the Road 2024

In this video, Xiaofeng and Jialin from Roblox describe their journey building a robust runtime diagnostic system using Pyroscope. With over 70 million daily active users and 4.4 million creators contributing to the platform, ensuring reliability and efficiency is paramount. They discuss the challenges they faced in debugging production issues and the manual, inefficient methods they previously used. Through thorough investigation and collaboration with Grafana Labs, they developed an on-demand profiling workflow that enables engineers to identify and address performance bottlenecks effectively.

Clinical troubleshooting with Dan Slimmon

It’s no secret that teamwork, when done right, can make a world of difference. So when responding to a particularly complicated incident, it can be best to bring a team together to figure out what’s going on and work towards a fix. But it’s not enough to just jam a bunch of folks into a room and hope for the best. You need a framework in place to ensure that everyone stays focused, diagnoses the issue, and resolves it as quickly as possible.

How to Achieve Observability as Code with Grafana | LiveRamp at ObservabilityCON on the Road 2024

Leveraging Terraform alongside the Grafana, Kubernetes, and Helm providers, the SRE team at LiveRamp has transformed every aspect of their operational toolkit. The team has turned everything from agent installations and synthetic checks to Grafana k6 performance testing, notification policies, contact points, and alerts into modular, code-based components, building an observability solution powered by Grafana Cloud. Learn how this integration delivers a robust, scalable, and easily manageable infrastructure that is setting new benchmarks for system reliability and efficiency across the business.