
How to Monitor AI Agents in Commerce Systems

Artificial intelligence (AI) isn’t just writing text or generating images anymore. It’s starting to make real-world decisions. Now, with agentic systems, we’re entering an era where AI models don’t just respond; they act autonomously, buying, booking, and negotiating on behalf of users. That may sound promising, but those of us in the trenches of reliability know that progress always comes with trade-offs. Make no mistake, this shift fundamentally changes how observability works.

The four pillars holding up your digital business, and what happens when they crumble

When we published the first Internet Resilience Report in 2024, the world was still reeling from the CrowdStrike outage that left airlines grounded and financial institutions scrambling. A year later, the stakes are even higher. The 2025 edition confirms what many of us already feel every day in IT Operations: resilience is no longer about uptime alone. It’s about protecting revenue, customer trust, and digital performance at scale.

When payments pause: lessons from a global payments outage

In digital commerce, payment reliability is non-negotiable. The rise of instant payments highlights this need: global instant payment transaction volume reached 195 billion in 2022, with projections to surpass 500 billion transactions by 2027 as more countries adopt faster payment systems. This growing reliance on real-time payment rails raises the stakes for reliability, with any disruption posing major risks to trust and revenue.

Observability 2025 Decoded: What the DZone Report Means for SLO-Driven Ops

DZone’s 2025 Intelligent Observability Trend Report captures a real inflection point: teams are shifting from “more data” to outcome-driven practices that improve resilience and accountability. The survey responses were gathered between August 28 and September 25, 2025, from a global pool of developers, architects, and IT professionals.

The next evolution of WebPageTest has arrived, and it's a game-changer

Now fully integrated into Catchpoint’s Internet Performance Monitoring (IPM) platform, WebPageTest is no longer just a testing tool; it’s your full-stack performance command center. From AI-powered insights to automation and Smartboards, the new WebPageTest gives digital experience teams everything they need to move beyond page speed and master end-to-end performance. Test smarter, detect faster, and optimize every layer of performance with a unified, AI-powered platform built for experts.

The Monitoring Blind Spot That Could Cost You Black Friday

With Black Friday and the holiday season looming, IT teams everywhere are bracing for what is, year after year, the most daunting stress test of their entire service delivery chain. Under relentless peak demand, every link in the digital experience is scrutinized by customers whose tolerance for friction is at an all-time low. It’s not just about uptime, monitoring dashboards, or technical metrics.

AWS Outage: How do you prepare for the failure of your own safety net?

When AWS’s massive outage struck, it didn’t just take down cloud services, apps, and enterprise platforms. It also knocked out many of the monitoring systems organizations depend on for real-time answers. Observability companies, including Datadog, New Relic, Checkly, Dynatrace, SpeedCurve, and Splunk Observability, lost visibility or functionality precisely when organizations needed them most.

Powering Mexico's Digital Future: Expanded Internet Observability with Catchpoint

As of 2025, more than 110 million Mexicans are online, putting digital‐access penetration at roughly 83% of the population. Mexico is already one of Latin America’s anchor markets, leading the region in startup momentum, cloud adoption, and cross-border digital trade. A few days ago, CloudHQ announced a $4.6B investment in Mexico to open multiple datacenters. Yet even with this scale, service quality still varies dramatically across cities, states, and ISPs.

APM vs Observability: Both-and, not either-or

This is the third and final entry in my series on APM and Observability, which was originally inspired by my contribution to an APMdigest article. I'll start by once again pointing out that APM tools can be built with observability in mind. Many are, in fact. And the ones that aren’t don’t turn into a different type of tool. In my experience, it's more that there's a difference of mindset.