Latest Posts

Risky Business: Implementing a Redundant Networking and Multi-CDN Monitoring Strategy

Last month, we partnered with AWS to put together a webinar on the importance of implementing a comprehensive redundant networking and multi-CDN monitoring strategy. You can replay the event in full here. In this article, we’ll recap the key takeaways covered by the panel of experts, which included Leo Vasiliou, Director of Product Marketing at Catchpoint, and Steve Campbell, our Chief Strategy Officer.

Incident Review - Rolling Comcast Outage Disrupts Work from Home for Millions of Users Across the U.S.

The rolling Comcast outage on Monday, November 8th and Tuesday, November 9th knocked customers offline across the U.S. The first wave took place Monday evening in the San Francisco Bay Area. The second, which had a wider geographic impact, occurred Tuesday morning, primarily affecting broad swathes of the Midwest, Southeast, and East Coast.

The eCommerce Holiday Calendar for DevOps

Online retailers expect, and even depend on, seasonal spikes in consumer activity throughout the calendar year. However, as shoppers rush to compete over door-buster deals and order holiday must-haves, web traffic escalates to levels that standard resource allocation cannot easily sustain. This spike in traffic can lead to unresponsive checkouts, lost or abandoned carts, and slow-loading pages, ultimately resulting in thousands of dollars in lost revenue.

Catchpoint Ushers In A New Era Of Visibility With The Addition Of 5G Mobile Edge Nodes

From its inception, Catchpoint has been a pioneer in observability, with the ability to deeply scan the infrastructures and protocols that bind the Internet. Our industry-leading observers gather in-depth data, providing the broadest coverage across wireless, cloud, backbone, and last mile networks. That data arms people across the enterprise with the information they need to provide a superior digital experience.

Changes In Technology: How Catchpoint Monitors And Observes The Internet, From 2008 To Today

Last month, Catchpoint celebrated its 13th birthday, a milestone which has us feeling more than a little nostalgic. As we embark on our teenage years as a company, we have also been looking back and reflecting on all the changes the world of technology has seen since Catchpoint was “born” back in 2008. The world looks very different today than it did at our founding and, for that matter, so do we!

VMworld 2021: Automation, Elastic Edge, And The Increasing Importance Of User Experience

During this year's VMworld, we announced that our solution, Catchpoint Digital Experience Monitoring, is now also available for purchase on VMware Marketplace. It is easier than ever for our customers to access, deploy, and start using Catchpoint solutions to achieve their business goals.

Site Reliability Engineering: Top SRE Tools As Voted On By SREs

Catchpoint is proud to present the top SRE tools as voted on by SREs. In our fourth annual SRE Survey, compiled in partnership with VMware Tanzu Observability and DevOps Institute, we simply asked, “What are a few tools that every SRE should have available in their toolbelt?” Today, we are excited to share the findings with you. While some of the answers were not strictly tools, the analysis gives us valuable insight into the mindset of an SRE.

Incident Review - An Account Of The Telia Outage And Its Ripple Effect

Another major outage on the Internet has taken place today. Telia, a major backbone carrier in Europe, suffered from a network routing issue between 16:00 and 17:05 UTC. This had a huge ripple effect, causing issues for multiple key companies providing critical cloud and infrastructure services. Companies affected include:

- Google Cloud
- Equinix Metal
- Cloudflare
- Fastly
- NS1

It’s always arresting to see the secondary and tertiary effects that a major outage can have.

Incident Review For the Facebook Outage: When Social Networks Go Anti-social

The following is an analysis of the Facebook incident on 10/4/2021. In a highly unusual turn of events, Facebook, Instagram, WhatsApp, Messenger, and Oculus VR were down simultaneously around the world for an extended period of time on Monday. The social network and some of its key apps started to display error messages before 16:00 UTC. They were down until 21:05 UTC, when things began to gradually return to normal.

Incident Review - Slack Outage Impacts A Subset Of Users Worldwide Due To DNS Issue

DNS observability is an essential part of any Ops team’s strategy. Looking for proof? It’s happening right now. It has been a busy week for Ops teams across the globe. Many were forced to urgently rotate SSL certificates after one of Let’s Encrypt’s root certificates expired. Collaboration plays a critical role in such situations, where members of one or more teams must communicate and work together to complete a shared task quickly and efficiently.