
Claude Code + OpenTelemetry: Per-Session Cost and Token Tracking

I was looking at our Claude Code spend in the Anthropic console the other day. Aggregate cost, aggregate tokens — no breakdown by developer, no breakdown by session. I knew my hackathon team had been using it heavily to build out new features for the OpenTelemetry Distro Builder. But heavily how? I had no idea. Turns out Claude Code has been emitting OpenTelemetry signals the whole time: per-session cost, token counts, every tool call it makes on your codebase.
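If you want to try this yourself, the gist is to launch Claude Code with the standard OTel exporter variables set and an OTLP endpoint listening. A minimal sketch, not a verified recipe — the CLAUDE_CODE_ENABLE_TELEMETRY flag comes from Claude Code's monitoring docs and may change between versions:

```python
import os
import subprocess

# Assumes the `claude` CLI is on PATH and an OTLP collector is listening on
# localhost:4317. Env var names follow Claude Code's monitoring docs; verify
# them against your installed version.
env = {
    **os.environ,
    "CLAUDE_CODE_ENABLE_TELEMETRY": "1",        # opt in to OTel export
    "OTEL_METRICS_EXPORTER": "otlp",            # cost/token counters
    "OTEL_LOGS_EXPORTER": "otlp",               # per-tool-call events
    "OTEL_EXPORTER_OTLP_PROTOCOL": "grpc",
    "OTEL_EXPORTER_OTLP_ENDPOINT": "http://localhost:4317",
}
subprocess.run(["claude"], env=env)
```

From there, the docs list metrics like claude_code.cost.usage and claude_code.token.usage, each tagged with a session id attribute (again, verify the exact names against your version), so grouping by that attribute in your backend yields the per-session breakdown the console doesn't give you.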

What Is OpenTelemetry and Why Do Organizations Use It?

Mining for information about your environments is like prospecting for gold. Prospecting can mean sifting through silty waters or blasting through a mine. Sometimes the nuggets are so small they are almost invisible, some things glitter but aren't gold, and occasionally a larger nugget means the miner strikes it rich. Understanding how a distributed system behaves is the same kind of work: sifting through vast amounts of telemetry, looking for patterns.

Turn Raw Data into Reliability by Changing Performance Perspectives

In a global microservices architecture, technical performance initially presents as a chaotic stream of disconnected telemetry. For a Technical Program Manager (TPM), success depends on moving past these disconnected individual data points to identify stable patterns. When services enter critical states, inspecting individual logs or traces is too slow. Protecting system reliability requires an engine that automates pattern recognition at scale.

OpenTelemetry Production Monitoring: What Breaks, and How to Prevent It

OpenTelemetry almost always works beautifully in staging, demos, and videos. You enable auto-instrumentation, spans appear, metrics flow, the collector starts, and dashboards light up. Everything looks clean and predictable. However, production has a way of humbling even the most carefully prepared setups. When real traffic hits, and it always spikes sooner or later, you start seeing dropped spans.
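The usual first failure is invisible: the SDK's BatchSpanProcessor silently drops spans once its export queue fills. A minimal Python sketch — the endpoint and the numbers are illustrative starting points, not tuned recommendations — of widening that queue and flushing more aggressively to buy headroom during spikes:

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# BatchSpanProcessor drops spans silently once its queue is full; a wider
# queue and a shorter flush interval give you headroom when traffic spikes.
provider = TracerProvider()
provider.add_span_processor(
    BatchSpanProcessor(
        OTLPSpanExporter(endpoint="http://localhost:4317", insecure=True),
        max_queue_size=8192,          # default is 2048
        schedule_delay_millis=2000,   # default is 5000
        max_export_batch_size=1024,   # default is 512
    )
)
trace.set_tracer_provider(provider)
```

The same pressure then moves downstream: the Collector needs its own memory limits and batching, or it simply becomes the next place spans get dropped.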

OpenTelemetry support for .NET 10: A behind-the-scenes look

At Grafana Labs, we are fully committed to the open source OpenTelemetry project and are actively engaged with the OTel community. Many Grafanistas spend a large proportion of their time contributing directly to OpenTelemetry upstream projects, helping make observability more powerful, reliable, and accessible for everyone as part of our big tent philosophy.

Troubleshooting Microservices with OpenTelemetry Distributed Tracing

Distributed tracing doesn’t just show you what happened. It shows you why things broke. While logs tell you a service returned a 500 error and metrics show latency spiked, only traces reveal the full chain of causation: the upstream timeout that triggered a retry storm, the N+1 query pattern that saturated your connection pool, or the cache miss that turned a 50ms call into a 3-second database roundtrip.
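None of that causation is visible unless spans nest correctly under one another. A minimal sketch with the OTel Python API — the service, route, and query names here are invented for illustration — showing how a handler span and its child database span stack up in the trace waterfall:

```python
from opentelemetry import trace

tracer = trace.get_tracer("orders-service")  # hypothetical service name

def get_order(order_id: str):
    # Parent span: the inbound HTTP handler.
    with tracer.start_as_current_span("GET /orders/{id}") as span:
        span.set_attribute("order.id", order_id)
        # Child span: the database call. A cache miss that turns this into a
        # 3-second roundtrip shows up nested under the handler span in the
        # waterfall, pointing straight at the cause.
        with tracer.start_as_current_span("SELECT orders") as db_span:
            db_span.set_attribute("db.system", "postgresql")
            ...  # execute the query here
```

Because the inner span inherits the outer one as its parent from the active context, the backend can render the nesting without any manual wiring between the two calls.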

The evolution of OpenTelemetry: A deep dive with co-founder Ted Young

Sometimes the biggest challenges in software aren’t about code — they’re about consensus. What do we call things? What do we standardize? And how do you evolve a system that thousands of companies depend on without breaking everything along the way?

OpenTelemetry in Production: Design for Order, High Signal, Low Noise, and Survival

A lot of the talk around OpenTelemetry has to do with instrumentation, especially auto-instrumentation, and with OTel being vendor-neutral, open, and a de facto standard. But how you use the final output of OTel is what makes the business difference. In other words, how do you use it to make your life as an SRE, DevOps engineer, or business person easier? How do you set things up to truly solve production issues faster?
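One concrete lever for "high signal, low noise" is sampling at the source. A hedged sketch with the OTel Python SDK — the 10% ratio is purely illustrative, and tail-based sampling in the Collector is the other common approach:

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.sampling import ParentBased, TraceIdRatioBased

# Head sampling: start roughly 10% of new traces, but always honor the
# parent span's decision so a sampled request stays fully traced end to end
# across services instead of producing broken partial traces.
provider = TracerProvider(sampler=ParentBased(TraceIdRatioBased(0.1)))
trace.set_tracer_provider(provider)
```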