The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

Microsoft Teams outage - December 10th, 2025

On the morning of December 10, 2025, Microsoft Teams experienced a service disruption affecting users across Australia. Although Microsoft 365 users reported issues across several apps, the hardest-hit service was Microsoft Teams, which became completely unusable for many organizations. While Microsoft did not acknowledge the incident until 03:46 UTC, StatusGator identified the issue at 02:52 UTC through incoming outage reports and delivered an Early Warning Signal at 03:01 UTC.
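To make the timeline concrete, the lead times implied by the three timestamps above can be worked out with a short sketch. It assumes all three times fall on the same UTC date; the variable and function names are illustrative and are not taken from StatusGator or Microsoft.

```python
from datetime import datetime

# Times (UTC) as quoted in the summary above. The shared calendar date is an
# assumption; only the differences between the times matter here.
FMT = "%Y-%m-%d %H:%M"
first_detected = datetime.strptime("2025-12-10 02:52", FMT)       # issue identified from outage reports
early_warning = datetime.strptime("2025-12-10 03:01", FMT)        # Early Warning Signal delivered
vendor_acknowledged = datetime.strptime("2025-12-10 03:46", FMT)  # Microsoft acknowledgement


def minutes_between(earlier: datetime, later: datetime) -> int:
    """Return the whole number of minutes from `earlier` to `later`."""
    return int((later - earlier).total_seconds() // 60)


# Hypothetical names for the two intervals of interest.
detection_lead = minutes_between(first_detected, vendor_acknowledged)
warning_lead = minutes_between(early_warning, vendor_acknowledged)

print(f"Detected {detection_lead} minutes before the vendor acknowledgement")
print(f"Warning delivered {warning_lead} minutes before the vendor acknowledgement")
```

Under these assumptions, detection preceded the acknowledgement by 54 minutes and the warning reached subscribers 45 minutes ahead of it.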

What Is IT Incident Response?

“We’ve got a new alert – have you seen it yet?” “Which one? The CPU spike or the unusual login?” “The login. Same region as yesterday. But the CPU thing looks suspicious too.” “…Alright, I’ll check the firewall logs. You take the containers.” “Perfect. Let’s hope this doesn’t turn into another all-hands situation.” Does this conversation sound familiar?

Every Business Needs a Robust Incident Response Strategy

In today's digital landscape, businesses face an increasing number of cyber threats that can compromise sensitive data, disrupt operations, and tarnish their reputation. As companies adopt more complex technological solutions, they must be prepared for the inevitable risk of security incidents. Having a well-established, effective incident response strategy is no longer optional but essential. This article explores why incident response solutions are critical for every business and how they play a pivotal role in safeguarding an organization's assets, reputation, and continuity.

When major IT incidents occur, AI can deliver speed and transparency

The recent Cloudflare outage served as a stark reminder of how fragile the global digital ecosystem can be due to a single point of failure. In a matter of minutes, thousands of websites that rely on Cloudflare’s CDN, from Fortune 500 brands to SaaS platforms and consumer apps, went offline for hours. The business impacts were severe, with Shopify alone suffering over $4 million in losses while downstream merchant impacts potentially exceeded $170 million.

New features: AI SRE, Merge alerts, and Status pages for thousands of services

As we head into the holiday season, the ilert team is doing the opposite of slowing down; we’re ramping up. Over the past weeks, we’ve shipped a wave of impactful improvements across alerting, AI-powered automation, the mobile app, and status pages. From major upgrades that reshape how teams triage incidents to smaller refinements that remove daily friction, this release is packed with updates designed to make on-call and operations smoother, smarter, and faster. Let’s dive in.

Shopify Outage 2025: Rise of the Commerce Kaiju

It was a normal day in the land of eCommerce. Birds were singing, dashboards were loading, and merchants everywhere felt cautiously optimistic. Then the ground trembled. A tiny glitch. A flicker. A warning log no one read. And suddenly— BOOM! Shopify burst out of the digital ocean like a gigantic scaly beast that woke up on the wrong side of the server rack. Checkouts froze mid-purchase. Product pages stopped producting. Merchants stared blankly at blank screens. The Commerce Kaiju had arrived.

Cloudflare was down again: Here's what happened.

On December 5, 2025, the internet faced another major disruption – the second significant Cloudflare-related outage in just a few weeks. A similar widespread incident occurred on November 18, which we covered in detail in our post “The internet broke again – StatusGator can help.” Today’s outage reinforces how quickly issues within core internet infrastructure can ripple outward and impact thousands of services simultaneously.

Towards a more resilient StatusGator

Between October 20 and December 5, 2025, a rapid succession of major outages across multiple cloud providers disrupted large portions of the internet. Each of these events affected StatusGator in different ways. After each incident, we implemented improvements to strengthen our reliability. This post summarizes the impact of each outage, the changes made, and the architectural work now underway to ensure StatusGator remains available during the moments when it is needed most.