Latest Videos

Failure Fridays at PagerDuty

Rich Lafferty, Staff SRE at PagerDuty, and Stevenson Jean-Pierre (SJP), Senior Manager, Software Engineering at PagerDuty, join Mandi Walls to talk about PagerDuty’s Failure Friday and Failure Any Day practices. PagerDuty uses failure injection and chaos engineering methods to maintain the reliability of its production services. Rich and SJP joined the PagerDuty live stream to talk about how the process works, how it has evolved, and how failure helps improve PagerDuty’s services.

The Unplanned Show, Episode 6: Defining AIOps with Heath Newburn

“AIOps” is a term some love to hate, but what makes it useful? In this episode, Heath Newburn breaks down the three things to look for in an AIOps solution: reduce noise, create context, and reduce toil. He also explains the challenges of domain-specific approaches versus domain-agnostic approaches to AIOps. But even within that approach, Heath warns of “gotchas” in rules “tech debt”, data formats, and long overall implementation times.

The Unplanned Show, Episode 5: DataOps with Snowflake

Long gone are the days when data was batch loaded into a data warehouse for business intelligence reports that were looked at periodically and, if something broke, a few internal people would have to wait. Today, data pipelines are “infinitely more complicated”, with more sources, from cloud services to on-premises systems, and they support data applications that are critical parts of a business’s ecosystem.

The Unplanned Show, Episode 4: Sriram Subramanian on Responsible Generative AI

Generative AI is a rapidly evolving ecosystem that is getting a lot of attention. In this episode, Dormain Drewitz asks Sriram Subramanian about the main challenges of implementing generative AI responsibly, including content that is harmful, inaccurate, or in violation of privacy or security standards. Sriram discusses Microsoft’s six tenets of responsible generative AI, as well as the notion of shared responsibility between platform providers and foundational LLMs on one side and the developers and data engineers building on top of them on the other. Sriram also answers questions about where to get started safely with generative AI and shares his framework for identifying opportunities to add value.

The Unplanned Show, Episode 3: LLMs and Incident Response

A software engineer, a data scientist, and a product manager walk into a generative AI project… Using technology that didn’t exist a year ago, they identify a customer pain point they might be able to solve, build on teammates’ experience with building AI features, and test how to feed inputs and constrain outputs into something useful. Hear the full conversation here.