You don't need a paid plan to use AI Root Cause Analysis

When an error appears in production, the hardest part often isn’t seeing what broke. It’s understanding why. That’s why we built Root Cause Analysis (RCA). It helps connect the dots between an error and its likely cause, so you can spend less time investigating and more time moving forward. Until now, RCA was only available through plans that included AI credits. Starting today, free plan users can purchase an AI credit subscription and use RCA without changing plans.

Splunk Observability at Cisco Live: Agentic Observability for the AI Era

Observability has always been about seeing clearly under pressure. But the pressure has changed. Applications are more distributed. Kubernetes environments keep expanding. Digital experiences depend on services, APIs, networks, third-party providers, and now AI models and agents that can make decisions faster than a human team can review every signal.

From Detection to Resolution: Why ServiceNow + xMatters Is the Fastest Path to Incident Resolution

AI is changing incident management, but not in the way most people think. For years, operations teams focused on getting better at detecting problems. Monitoring improved. Observability improved. AI is now helping teams correlate signals, reduce noise, and identify issues faster than ever before. That’s all valuable, but many organizations are discovering that finding the problem is no longer the hardest part. The harder part is everything that happens next. Who owns the issue?

The AI ROI Company's new groove: CloudZero's new UI, and what it means for customers

Customizability. Feature velocity. Performance. Capabilities that are critically important to all B2B software users. And capabilities in which CloudZero’s brand-new platform specializes. Pitching a total frontend overhaul didn’t necessarily make me CloudZero’s most popular new PM. But it’s made CloudZero faster, more customizable for a wider range of personas, and easier to update with the new features that matter most to our customers. And, if I may say, it also looks beautiful.

Claude Opus 4.8: Pricing, benchmarks, and which model to actually run

Anthropic shipped Claude Opus 4.8 on May 28, 2026, exactly 41 days after Opus 4.7. The SERP was empty for two days after launch. Not because nobody cared. Because engineering managers and finance teams were doing the math on whether the bill changes.

Observability Summit NA 2026: What the Community Is Thinking About

Two days in Minneapolis with the OpenTelemetry community, talking about where telemetry pipelines are headed and what the AI wave is doing to them. Two topics dominated everything: AI and cost reduction. Not as separate conversations, either. The more the community talked about AI telemetry, the more the cost question followed right behind it. I joined Diana Todea from VictoriaMetrics and Antonio Jimenez Martinez from Cisco ThousandEyes on the Telemetry That Matters panel.

How LivePerson optimized Logstash and Kafka performance on GCP through benchmarking

By benchmarking five GCP machine types across both Logstash and Kafka, LivePerson's observability team found that infrastructure selection (not just pipeline configuration) is one of the highest-leverage cost optimization decisions at scale.

Code isn't cheap, but POCs are

I keep hearing the phrase "code is cheap." I don't know who came up with it. Whoever it was clearly has not seen an Anthropic bill. I get what they mean. The cost of writing a line of code has cratered, AI does most of the typing, you know the rest. Fine. But the phrase is combative in a way that doesn't help anyone, especially the engineers in the room. "Code is accessible" lands better. Less swagger, more honesty. Either way, here's the line my friend Guillaume gave me that finally cracked it open.

What is InfiniBand?

When distributed workloads stall because nodes cannot exchange small messages quickly and consistently, the network is the limiting factor. How do you solve that problem? InfiniBand offers one solution. InfiniBand is an interconnect, meaning the end-to-end communication system that links compute, storage, and accelerator nodes. It is implemented as a purpose-built network fabric, the switching and transport layer engineered to deliver high bandwidth and low, predictable latency between those nodes.