Operations | Monitoring | ITSM | DevOps | Cloud

July 2023

The Unplanned Show, Episode 7: Death of the Single Security Pane of Glass with Heather Hinton

In this episode, Heather Hinton describes how security teams can evolve away from spending cycles on “silly little jobs” and scouring multiple sources to try to identify the kinds of unplanned interrupt work that needs to be dealth with urgently. Instead, they can complete projects faster and take on more because on-call rotations are spent getting work done (with the occasional interruption) instead of “seeking” for the interrupt work. We also discuss how this fits in with encouraging broader employees to participate in security hygiene practices.

How to Maximize Time Savings and Reduce Toil During Incident Response

Incidents are a costly burden on businesses. Despite assembling the right people and teams, the manual work, tool setup and prolonged tasks can negatively impact customer experience. The need for adaptable processes to address diverse incident types further complicates the situation. This is where the PagerDuty Operations Cloud steps in. It streamlines and automates all the various manual steps in the incident response process.

Failure Fridays at PagerDuty

Rich Lafferty, Staff SRE at PagerDuty and Stevenson Jean-Pierre, Senior Manager, Software Engineering at PagerDuty join Mandi Walls to talk about PagerDuty’s Failure Friday and Failure Any Day practices. PagerDuty has been using failure injection and chaos engineering methods to maintain the reliability of production services. Rich and SJP joined the PagerDuty live stream to talk about how the process works, how it has evolved, and how failure helps improve PagerDuty’s services.

The Unplanned Show, Episode 6: Defining AIOps with Heather Newburn

“AIOps” is a term some love to hate, but what makes it useful? In this episode, Heath Newburn breaks down the three things to look for in an AIOps solution: reduce noise, create context, and reduce toil. He also explains the challenges with domain-specific approaches, versus domain-agnostic approaches to AIOps. But even within that approach, Heath warns of “gotchas” in rules “tech debt”, data formats, and overall long implementation times.

10 Years of Failure Friday at PagerDuty: Fostering Resilience, Learning and Reliability

In today’s fast-paced and ever-evolving world of technology, failure is inevitable. Organizations should embrace failure as a learning opportunity for how to build and deliver more resilient services. At PagerDuty, we’ve practiced Failure Friday for 10 years now. Failure Friday–a practice inspired by the chaos engineering space–involves intentionally injecting failures into our systems to improve reliability and foster a proactive engineering culture.

What's New in PagerDuty iOS and Android Mobile Applications

The PagerDuty Operations Cloud is your platform for action in critical moments. By harnessing the capabilities of AI and automation, it has the ability to detect and diagnose disruptive incidents, assemble the appropriate team members for prompt response, and optimize your digital operations by streamlining infrastructure and workflows.

Gartner Market Guide: Embedding Automation Into the Enterprise

“Existing workload automation strategies are unable to cope with the expansion in complexity of workload types, volumes and locations driven by evolving business demand, as per Gartner. Digital business is slowed without collaboration and automation inside and outside of IT, leading to siloes of capabilities across business and IT teams.Cost optimization is an evolving challenge, driven by technical debt and requirements to demonstrate business value of investments.”

The Unplanned Show, Episode 5: DataOps with Snowflake

Long gone are the days when data is batch loaded into a data warehouse for business intelligence reports that are looked at periodically and if something is broken, a few internal people would have to wait. Today, data pipelines are “infinitely more complicated”, with more sources from cloud services to on premises systems, and supporting data applications that are critical parts of a business’ ecosystem.

The Unplanned Show, Episode 4: Sriram Subramanian on Responsible Generative AI

Generative AI is a rapidly-evolving ecosystem with a lot of attention. In this episode, Dormain Drewitz asks Sriram Subramanian about the main challenges to responsibly implement generative AI, including content that’s harmful, inaccurate or violates privacy or security standards. Sriram discusses Microsoft’s 6 tenets to responsible generative AI, as well as the notion of shared responsibility between platform providers and foundational LLMs and the developers and data engineers building on top. Sriram also answers questions about where to get started safely with generative AI and shares his framework for identifying opportunities to add value.

PagerDuty Extends Operations Cloud Leadership into AIOps and Automation

Forrester Names PagerDuty a Leader in first-ever Process-Centric AIOps Wave From helping pioneer the DevOps movement to establishing best practices around service ownership to being the standard in incident response, PagerDuty has a long history of leadership. PagerDuty is honored to add to this list and now be recognized as a leader in the AIOps and Automation space by Forrester.