Where did all my Claude Code tokens go?

Most teams judge their AI coding agent on two things: the monthly bill and a feeling. The bill tells you what you spent and the feeling tells you whether it seems to be helping, but neither one tells you what the agent actually did. As these tools move into the critical path of how software ships, that gap is starting to matter. I wanted to replace the feeling with something I could measure, and to understand which kinds of work drive the bill, so I ran an experiment on myself.

The AI bill arrived. Now what?

There was a time when “Opus” meant a classical composition and “Sonnet” was fourteen lines of Shakespeare you definitely did not read before the test. Now they’re model tiers, and every new release rewrites the economics of your engineering org whether you’re ready or not. Currently, your monthly total hides the crucial information you need to control and justify AI spend.

The Data Plane Reality: OTel Scales, While Topology UX Lags

OpenTelemetry won the architectural standards battle. At scale, though, telemetry breaks more like plumbing than code. It breaks quietly, across a graph, with a blast radius you don’t understand until it’s expensive. With over 65% of organizations now running more than 10 collectors in production, hybrid deployments across Kubernetes and VMs are accelerating fast. Telemetry standardization is no longer a project milestone. It is a baseline expectation.

Un-observable AI is Un-trustworthy AI

Recently, someone talked Chipotle’s customer support agent into reversing a linked list – a task that has nothing to do with burritos. Screenshots circulated, people laughed, but underneath the joke sat a sharper question. If a production support agent will do that on a public channel, what else will it do that nobody is screenshotting? The bug is funny. The trust gap behind it is not.

DataPrime at ingest (DPXL): See the impact of any routing decision

TCO policies have always been one of the most impactful cost levers in Coralogix. Route business-critical data to High, push monitoring data to Medium, archive compliance logs to Low. With the addition of DataPrime expressions (DPXL) – a subset of the DataPrime query language designed for inline filtering at ingest – that routing became even more precise, matching on any field in the event payload, not just application, subsystem, and severity.

Explore for Spans: One View with Infinite Depth

It’s 20 minutes into a P0 incident, and you have already switched between four different tools, re-authenticated twice, and translated queries across three incompatible query languages. The root cause you are searching for? That is still out there somewhere. The reality of investigative latency is that most engineering teams face navigation problems, not data problems. During high-pressure incidents, teams lose cognitive momentum to context switching between disconnected telemetry silos.

New Explore: Faster answers, less friction, and a better way to investigate your data

There is a moment every engineer knows too well. Something is wrong in production. You have an alert, a vague symptom, and pressure to find the one signal that explains what changed. You open your logs and traces, and you immediately hit the same two problems: the dataset is huge, and the path from “I see something odd” to “I understand why” is full of tiny, exhausting steps. Meet new Explore, our redesigned investigation experience for logs, traces, and spans.

What Is APM? A Guide to Application Performance Monitoring

A well-instrumented service tells your on-call engineer which deploy broke checkout, which span ate the latency budget, and which line to revert before the support queue fills up. Getting there depends on how cleanly your application performance monitoring layer turns telemetry into answers. The sections ahead walk through how APM works, the metrics and components worth tracking, the cloud-native challenges at scale, and how to evaluate APM tooling against your real workload.