New York, NY, USA
2008
  |  By Denton Chikura
For eight years, the survey behind the SRE Report has used a consistent methodology. That consistency allows us to track how reliability work evolves over time, rather than relying on snapshots. One of the most stable questions in the survey asks respondents to estimate how much of their work, on average, is spent on toil. Between 2020 and 2024, responses showed a gradual decline in reported toil.
  |  By Leo Vasiliou
Reliability used to mean “are we up?” Today, customers ask something more demanding: “Are you fast, everywhere, every time?” The SRE Report 2026 shows that this change is no longer emerging. It is already established.
  |  By Leo Vasiliou
You shouldn’t have to understand the care behind this report, unless it’s missing. For the past eight years, this research has focused on all things related to reliability and resilience. How systems behave under stress. How teams respond when things break. And how the practices continue to evolve. Reaching the eighth edition of The SRE Report attests to that and gives me pause. You can read the full report here and you can find a summary of the key findings here.
  |  By Denton Chikura
This is the eighth edition of the SRE Report. Eight years of tracing reliability's arc, from uptime obsession to experience, from toil to intelligence, from systems to people. This year's report is also the first since Catchpoint joined LogicMonitor. We want to acknowledge their support in keeping this work going. They get what this report means to the reliability community, and that matters. We made a deliberate choice this year to say less.
  |  By Gerardo Dada
In modern observability practices, distributed tracing has become table stakes. Most application performance monitoring (APM) platforms encourage an “instrument everything” approach: Deploy an SDK or agent, hook into every service call and capture every user interaction at scale. On paper, this sounds like complete visibility. In practice, it can turn into a costly firehose of data with diminishing returns.
  |  By Catchpoint Team
On December 15, 2022, Catchpoint launched Internet Performance Monitoring (IPM) as a new category for monitoring solutions with our foundational article, “What is Internet Performance Monitoring and How is it Different from APM?” In it, we said: How prophetic those words turned out to be.
  |  By Mehdi Daoudi
In 2008, I was sitting in my garage office with a simple but stubborn idea: the Internet deserved better. End users deserved better. Companies needed a way to truly understand what their customers were experiencing, not just what their servers were reporting. Digital Experience Monitoring wasn’t a category yet. But the need was unmistakable. That idea didn’t come from theory or ambition. It came from lived experiences.
  |  By Mehdi Daoudi
Another day, another massive Internet disruption, and this time it’s Cloudflare taking huge parts of the Internet offline. This incident is not an anomaly. It is part of a recurring pattern that has become standard in digital infrastructure. We have reached an inflection point in digital operations. Outages at major cloud and content delivery network (CDN) providers are now expected. The only real uncertainty is when it will happen next.
  |  By Payal Chakraborty
We recently hosted our first-ever Peak Performance Summit in Bangalore, India, a one-day event focused on how value-based observability drives digital business outcomes. The summit brought together customers, partners, and technology leaders to share real-world experiences, live demos, and forward-looking ideas. The message running through every session was clear: performance isn’t just about speed. It’s about measurable business results.
  |  By Leon Adato
Remember how I said that blog was going to be my last entry on the topic of "APM vs Observability"? Well, it turns out I had a little more to say. I'd like to spend a few moments talking about the future of APM and Observability. I think it comes down to two major initiatives: AI and OpenTelemetry. (NOTE: in this section, I'm using the word "observability" to refer to the discipline of monitoring and observability as a whole, rather than any specific tool, technique, or vendor-based solution.)
  |  By Catchpoint
Catchpoint’s new AI Advisor discovers unknown unknowns to improve your performance monitoring by seeing what you might’ve missed.
  |  By Catchpoint
Get detailed data on how real users are using your applications in real time with Catchpoint’s Session Replay, with a user’s session replayed like a movie.
  |  By Catchpoint
Leadership is about more than telling people what to do. It’s about inspiring belief in your vision for the future. Sometimes there’s a delay between the time you share the vision and when the rest of the team “gets it”. The Latency & Leadership series hopes to shorten that lag time by creating a platform for leaders in the tech space to share their ideas, their passion, and their vision.
  |  By Catchpoint
Introducing Catchpoint’s brand new AI-driven automatic root cause analysis tool! Stack Map Root Cause Analysis AI Insights.
  |  By Catchpoint
This month, we’re all about those features, baby! From AI-driven automatic root cause analysis, to playing back user RUM sessions like a movie, to discovering unknown unknowns with AI-driven advisors, Catchpoint has what you need to improve your IPM.
  |  By Catchpoint
An eCommerce platform, a banking app, even a simple user portal depends on a web of APIs, cloud tools, hosting services, and edge networks. Each one introduces another potential point of failure. And when those dependencies break? User experience suffers. Brand trust takes a hit. Millions in revenue are at risk. That’s why leading digital businesses, especially in eCommerce and banking, are expanding visibility beyond the application stack.
  |  By Catchpoint
Modern digital services rely on complex systems, and chaos can strike at any layer. But the most effective teams don’t wait for failure to learn. They simulate it. By introducing controlled performance degradations, you can stress your systems, test your dependencies, and uncover hidden risks without touching production. In our latest webinar, Catchpoint experts walk through how teams are building resilience through proactive, safe failure testing, and why it’s become a cornerstone of digital reliability.
  |  By Catchpoint
If you’re only monitoring your applications, you’re missing what your customers are actually seeing. Performance issues don’t happen in a vacuum. They happen at the edges, on mobile devices, over congested networks, in last-mile dead zones. Monitoring only works when it’s aligned with reality. And reality starts at the user.
  |  By Catchpoint
Behind the Dashboard is an ongoing series where we look under the hood of a specific Catchpoint feature. Each episode breaks down the technology itself, what’s challenging about using it for monitoring, and how we removed friction and toil to make it a valuable part of the Catchpoint platform. In this episode, Leon, Mursi, and Rahul take a look at Catchpoint’s LLM monitoring capabilities, including ensuring your integrated LLMs are up and performing optimally, as well as knowing if you’re using the most effective (accurate) and economical (cheapest per query) option in your suite.
  |  By Catchpoint
We talk a lot about the application stack, the code and services you build. However, just as critical is the infrastructure that delivers that code to your users. That’s the Internet Stack: a complex chain of technologies and services, from DNS and BGP to CDNs and ISPs, that every digital experience depends on. It’s separate from your application stack. It’s different for every user, in every geography. And most importantly, it still impacts your users—even if you don’t directly own it.
  |  By Catchpoint
Now in its fifth year, The SRE Report has become the trusted source of trends and insights for reliability-as-a-feature practices. This year, in partnership with Blameless, the report contains special contributions from Adrian Cockcroft and Steve McGhee and highlights findings from a global community of reliability practitioners, including SREs, managers, architects, and executives. As ever, we found some familiar trends and some thought-provoking anti-patterns.
  |  By Catchpoint
In-depth analysis of key Internet outages across the past 18 months, from AWS to Facebook; includes six critical lessons for IT teams to improve Internet Resilience.
  |  By Catchpoint
The Border Gateway Protocol (BGP) is the primary protocol for how packets are routed across the internet. It's one of the pillars of the digital world, but one which comes with serious vulnerabilities.
  |  By Catchpoint
A 2023 commissioned study conducted by Forrester Consulting: Internet Performance Monitoring avoids Internet disruptions and mitigates risk for a successful eCommerce business.
  |  By Catchpoint
Supporting an anywhere, anytime, hybrid workforce is now a top priority as hybrid work continues to be the new norm. IDC's new Spotlight Paper provides valuable insights and actionable recommendations for implementing a robust and resilient employee experience strategy to ensure your workforce stays connected, engaged, and productive. Download the paper to learn more.

Catchpoint is the Internet Resilience Company™. The top online retailers, Global 2000 companies, CDNs, cloud service providers, and xSPs in the world rely on Catchpoint to increase their resilience by catching any issues in the Internet Stack before they impact their business. Catchpoint's Internet Performance Monitoring (IPM) suite offers synthetics, RUM, performance optimization, high-fidelity data, and flexible visualizations with advanced analytics. It leverages thousands of global vantage points (including inside wireless networks, BGP, backbone, last mile, endpoint, enterprise, ISPs, and more) to provide unparalleled observability into anything that impacts your customers, workforce, networks, website performance, applications, and APIs.