
SRE Report Retrospectives - Have AIOps Predictions Held Up?

Welcome to a new blog series where we take a candid look at the predictions, insights, and bold claims we've made in previous SRE Reports and ask the uncomfortable question: How did we do? For the uninitiated, Catchpoint's SRE Report is our annual, practitioner-driven effort to capture the pulse of the global reliability community.

When BGP becomes UX: The inside story of a SaaS routing decision gone wrong (or right)

Most operations teams trust their green dashboards. If the internal monitoring says everything is healthy, the app must be fine, right? But as the Internet keeps proving, what’s green inside the firewall can look red for customers outside of it. Sometimes, a single change in how web traffic moves can suddenly slow logins, disrupt websites, or hurt business results, even if everything looks fine inside.

Introducing Catchpoint Session Replay: See Digital Experience Through Your Users' Eyes

When was the last time you really saw what your customers experience on your site? We're excited to introduce Session Replay, a new capability in our Internet Performance Monitoring (IPM) platform that lets you step directly into the user's journey. Session Replay is so much more than a platform upgrade. It’s an opportunity to understand, fix, and even prevent the issues that lead to churn, missed conversions, and frustrated users, all from their point of view.

How AI Turns Monitoring From "What Now?" Into "What's Next?"

It's 3 AM. Your phone starts buzzing with alerts, and you stumble to your laptop only to be greeted by a dashboard that looks like the control panel of a nuclear reactor in meltdown: red lights everywhere. Numbers that should be green are decidedly not green. And your brain, still foggy from sleep, is asking the most fundamental question in all of IT operations: "Okay, yes, there's clearly a problem... but, now what?"

Making the invisible visible: Are your cloud firewalls and DDoS protection really working?

Every business builds strong defences to keep attackers out. Firewalls and DDoS protection serve that purpose, standing guard over company apps and websites, like knights at the castle gate keeping out trolls (not just the ones on X). But here’s the problem: those defences only work if users actually walk through the front gate. Sometimes, people find hidden paths or side doors around your walls, so the guards never see them enter.

APM vs Observability: Observing beyond APM

In my previous post I made a bold, sweeping statement that APM is not - in the most specific sense - a subset of observability. I stand by that because words matter and - like many "monitoring engineers" (IT folks who make monitoring and observability their specialty) - I, too, bear scars from the flame wars on Twitter back in the 2020s, where we fought internecine battles over the proper definition of (and number of pillars in) “observability”.

Why it's time to move beyond APM: Monitoring from the user's perspective

For years, organizations have relied on Application Performance Monitoring (APM) as the backbone of their observability strategy. The idea was simple: collect as many logs, metrics, and traces as possible, then sift through the data to uncover insights. But as applications have shifted to the cloud and become increasingly API-driven, that model has broken down.

When metrics mislead: Inside the 2025 Retail Web Performance Benchmark

Over the past few years at Catchpoint, we’ve benchmarked the digital performance of banks, airlines, hotels, travel aggregators, GenAI platforms, athletic footwear brands, and even ad hoc events like the Super Bowl, Olympics, and Election Day. Each time, our approach focused on the technical metrics performance professionals live and breathe: DNS resolution times, Time to First Byte, page load speeds, and six other core measurements that we'd dissect, analyze, and use to rank companies.

The vendor trap: why your next outage won't be your fault - but will be your problem

Today’s enterprises don’t run on singular self-contained systems—they’re intricate webs of interdependence: cloud services, APIs, CI/CD tools, DNS, CDNs, SASE vendors, identity management providers, cloud interconnects, ISPs, SaaS applications, application components, microservices, etc. A recent industry survey found that 84% of organizations suffered operational disruption from third-party risk incidents, with 66% facing adverse financial impact.

From SEO to AEO: Why Web Performance Is the Key to AI Search Success

Search isn’t what it used to be. The way people discover information online is shifting. Instead of clicking through search results, many now ask AI answer engines like ChatGPT and Perplexity to do the research for them. In March 2025, 13.1% of Google desktop searches featured AI Overviews, doubling from over 6% in January, according to Semrush analysis of 10+ million queries.