Operations | Monitoring | ITSM | DevOps | Cloud

Trace without traces

A customer emailed on a Tuesday: checkout hung for ten seconds. I opened our tracing tool, punched in the time window, and got nothing. The trace was sampled out. We keep 1% of traces, like most shops with real traffic do. The one request that actually mattered was in the 99% we threw away. I spent twenty minutes admiring our observability stack before admitting it couldn’t answer a first-grader’s question: what happened to this person? Here’s what I know now.

June 2026 Early Warning Signals

June 2026 saw major outages across ecommerce, AI, developer tools, and business applications. StatusGator’s Early Warning Signals surfaced many of these incidents before providers updated their official status pages. Of the 1,067 incidents detected by StatusGator in June, only 191 (17.9%) were eventually acknowledged by providers.

Introducing relationships for Service Monitors

Understanding a service outage is easier when you can see what it’s connected to. That’s why we’re introducing Relationships for Service Monitors, one of the most requested features from StatusGator’s hundreds of enterprise IT teams. You can now explore related services directly from the Service Details page by opening the Relationships dropdown.

Any Apple update can break our app. Here's how we find out first.

This is a guest post by Dan Mindru, a Frontend Developer and Designer who is also the co-host of the Morning Maker Show. Dan is currently developing a number of applications including PageUI, Clobbr, and CronTool. It feels like with every release, we are walking a tightrope. We need to keep our app lightweight, stable, and performant, all the while depending on APIs that can shift at any moment (without warning, too!).

Self-Healing ITOps: Close the Loop From Detection to Resolution

Self-healing ITOps helps restore services faster by combining AI-driven analysis, automation, and recovery validation. Organizations have invested heavily in monitoring, observability, and AIOps. These platforms are effective at identifying issues, but incident resolution is often still a manual process. Engineers still need to investigate alerts, determine the appropriate remediation, and verify that services have recovered.

When One Agent Plans and Another Executes, the Planner's View Decides Everything

Split network operations into a planning agent and an executing agent and you have an elegant design on paper. One agent reasons about what should change and validates it. The other carries it out. The elegance is real, and so is the structural consequence: the split puts the entire weight of judgment on the planner. A plan built on a partial view, then executed precisely and at machine speed, is more dangerous than a cautious human who would have hesitated at the part that did not add up.

How Liftoff cut costs by 87% and latency by 75% with HAProxy

Liftoff, a mobile advertising company, processes 1.5 trillion bid requests every month. Their platform touches 275 million unique devices daily across 150 geographies. At that scale, the proxy layer is a core part of the business. For years, Liftoff relied on a managed enterprise proxy vendor. It worked, until it didn’t.

New in Skylar One - Kyoto: Better Context for Faster, More Confident IT Operations

Modern IT environments do not fail in neat, isolated ways. A network issue in one location can affect a business service somewhere else. A device alert may be the first sign of a larger dependency problem. And when teams are managing infrastructure across data centers, cloud, branches, campuses, and edge environments, the first challenge is often knowing where to look first. The issue is not alert volume alone. It is the missing context between telemetry, service impact, probable cause, and action.

Rewiring Operations for the Agentic Era: The 4 Decisions on the CEO's Desk

For two years, the enterprise's question about AI was which model to buy. That question is already settling. Frontier capability is becoming abundant - rentable by anyone, swappable in an afternoon, and roughly identical in your hands and your competitor's. An advantage everyone can buy is not an advantage. What can't be bought is the thing underneath it: a system that has learned how your business actually works - the intelligence your enterprise accumulates and no competitor can replicate.