Demo Roundups! Identifying System Weaknesses to Improve Resilience

How do you proactively identify weaknesses before they lead to costly incidents? Find out how PagerDuty empowers teams to uncover vulnerabilities, streamline incident response, and enhance operational performance to build more resilient systems. Host: Mandi Walls, DevOps Advocate at PagerDuty. Guests: Alex Nauda, CTO at Nobl9; Rich Lafferty, Principal SRE at PagerDuty.

[Webinar] Drift Happens! 3 Kubernetes Drift Scenarios & How to Overcome Them

Many organizations don’t realize the impact of Kubernetes drift until they’re managing multiple clusters and face skyrocketing costs, delayed fixes, and downtime. Drift can lead to inefficiencies and even critical security risks. That’s why it’s crucial to understand and proactively address drift at scale.

How does observability enhance operations in cloud native voice networks?

The 17th century saw the onset of the Scientific Revolution, a time of explosive growth in knowledge. Scientific practices were still evolving, with no universally accepted protocols for data collection and analysis. Individuals documented any observation, resulting in the collection of massive amounts of information, but much of it was hard to use to draw meaningful conclusions. The development of the scientific method provided a common approach to capturing and analyzing data.

The hitchhiker's guide to infrastructure modernization

One of my favourite authors, Douglas Adams, once said that “we are stuck with technology when what we really want is just stuff that works.” Whilst Adams is right about a lot of things, he got this one wrong – at least when it comes to infrastructure. As our Infra Masters 2025 event demonstrated, infrastructure is the technology that makes everything work – from managing a satellite in outer space, to, say, livestreaming an event.

Build a scalable internal developer portal with Backstage and CircleCI

Internal developer portals (IDPs) have become essential tools in platform engineering, helping standardize developer workflows and reduce friction by providing self-service access to tools, APIs, and infrastructure. During my time on a platform team, I experienced firsthand the transformative power of IDPs. Our team implemented custom solutions that significantly reduced load on developers, allowing them to focus on writing code rather than navigating complex infrastructure.

How to Build a Successful SIEM Migration Strategy

At least once a week, a team reaches out to discuss migrating from an established SIEM or analysis platform. This major decision is influenced by several compelling factors, which can create significant work for engineering teams and pose risks to the business. The cost of switching to a new platform, often referred to as displacement costs, can be substantial.

Responsible AI: What It Means & How To Achieve It

The information age has leapt forward with the explosive rise of generative AI. Capabilities like natural language processing, image generation, and code automation are now mainstream, driving the business goals of winning customers, enhancing productivity, and reducing costs across every sector. New large language models are emerging almost daily, and existing models are being optimized in a frantic race to the top. There seems to be no stopping the AI boom.

Data Backup Strategies: The Ultimate Guide

Despite the nonstop warnings, millions of users still gamble with their data. A 2023 survey by Acronis revealed that 41% of people rarely or never back up their digital files, and businesses aren’t much better. Fewer than 20% of businesses back up their SaaS data, even though tools like Google Workspace and Microsoft 365 don’t guarantee full recovery after a loss or attack. The consequences?