Cloudflare outage: another wake-up call for resilience planning

Another day, another massive Internet disruption, and this time it’s Cloudflare taking huge parts of the Internet offline. This incident is not an anomaly. It is part of a recurring pattern that has become standard in digital infrastructure. We have reached an inflection point in digital operations. Outages at major cloud and content delivery network (CDN) providers are now expected. The only real uncertainty is when the next one will happen.

Catchpoint Peak Performance Summit 2025: Redefining Observability for the Outcome Economy

We recently hosted our first-ever Peak Performance Summit in Bangalore, India, a one-day event focused on how value-based observability drives digital business outcomes. The summit brought together customers, partners, and technology leaders to share real-world experiences, live demos, and forward-looking ideas. The message running through every session was clear: performance isn’t just about speed. It’s about measurable business results.

APM vs Observability: What comes next?

Remember how I said that blog was going to be my last entry on the topic of "APM vs Observability"? Well, it turns out I had a little more to say. I'd like to spend a few moments talking about the future of APM and Observability. I think it comes down to two major initiatives: AI and OpenTelemetry. (NOTE: in this section, I'm using the word "observability" to refer to the discipline of monitoring and observability as a whole, rather than any specific tool, technique, or vendor-based solution.)

How to Monitor AI Agents in Commerce Systems

Artificial intelligence (AI) isn’t just writing text or generating images anymore. It’s starting to make real-world decisions. Now, with agentic systems, we’re entering an era where AI models don’t just respond; they act autonomously, buying, booking, and negotiating on behalf of users. That may sound promising, but those of us in the trenches of reliability know that progress always comes with trade-offs. Make no mistake, this shift fundamentally changes how observability works.

The four pillars holding up your digital business, and what happens when they crumble

When we published the first Internet Resilience Report in 2024, the world was still reeling from the CrowdStrike outage that left airlines grounded and financial institutions scrambling. A year later, the stakes are even higher. The 2025 edition confirms what many of us already feel every day in IT Operations: resilience is no longer about uptime alone. It’s about protecting revenue, customer trust, and digital performance at scale.

When payments pause: lessons from a global payments outage

In digital commerce, payment reliability is non-negotiable. The rise of instant payments highlights this need: global instant payment transaction volume reached 195 billion in 2022, with projections to surpass 500 billion transactions by 2027 as more countries adopt faster payment systems. This growing reliance on real-time payment rails raises the stakes for reliability, with any disruption posing major risks to trust and revenue.