
The latest news and information on distributed tracing and related technologies.

Open Standards Observability - Prometheus & OpenTelemetry

Modern applications are distributed, ephemeral, and built from a dozen moving parts. To keep them reliable, you need real visibility: not just “is the server up?” but “how is this request behaving, right now, across every component it touches?” The good news is that the observability world has converged on a handful of open standards: Prometheus for metrics, OpenTelemetry for telemetry, plus battle-tested protocols like StatsD and NRPE.

Grafana Tempo: The distributed tracing journey to 3.0 (June 2026 Community Call)

Our distributed tracing journey from the inception of Tempo to 3.0. Grafana Cloud is the easiest way to get started with Grafana dashboards, metrics, logs, traces, and profiles.

If You Are Building a Startup from a Vibe-Coded App, Don't Skip This #devops #programming #ai

Everyone is vibe coding products right now. But most applications are missing one crucial thing: observability. In this video, I talk about how you can literally start this weekend. If you are turning your vibe-coded app into a real startup, observability should not be an afterthought.

Running the OpenTelemetry Collector as a Lambda

The OpenTelemetry Collector is usually deployed as a long-running process: a sidecar, a DaemonSet, an EC2 instance, a Docker container on my computer. It sits there listening for telemetry. That's fine when I want to send telemetry all day, but not when telemetry is rare. Like right now, when I have an agent defined on AgentCore, and it runs a few times a week maybe. Or my website that hardly sees any traffic. Can I run the OpenTelemetry Collector as a Lambda function?

Errors, traces, logs, metrics: when to reach for what

When should I reach for a log, a trace, or a metric? I hit that question constantly when I instrument code, and I watch coding agents hit it too. It sounds like it should be obvious. Errors, traces, logs, and metrics are the four kinds of telemetry most apps run on, four tools in one box, and they overlap enough that the honest answer is every developer’s favourite: it depends. You can stuff context into span attributes instead of logging it. You can count log events instead of emitting a metric.
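
To make the overlap concrete, here is a minimal sketch of "counting log events instead of emitting a metric", using only the Python standard library. The `CountingHandler` class, the `checkout` logger name, and the log messages are hypothetical and not taken from the article; a real setup would more likely derive such counts in the telemetry pipeline than in application code.

```python
import logging
from collections import Counter


class CountingHandler(logging.Handler):
    """Hypothetical handler that derives a counter from log events."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def emit(self, record):
        # Key by (logger name, level), much like a metric's label set.
        self.counts[(record.name, record.levelname)] += 1


def demo():
    logger = logging.getLogger("checkout")  # hypothetical service name
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the demo quiet on stderr
    handler = CountingHandler()
    logger.addHandler(handler)

    logger.info("order placed")
    logger.info("order placed")
    logger.error("payment declined")

    logger.removeHandler(handler)
    return handler.counts


counts = demo()
```

The trade-off is the one the teaser hints at: the count comes for free once the logs exist, but it is only as reliable as the log lines themselves, whereas a dedicated metric is cheaper to aggregate and does not change when someone rewords a message.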

Your AI App Is Lying to You - Here's How to Fix That #devops #observability #programming

You shipped your AI app. But do you have all the answers? Do you actually know which model ran, how many tokens it consumed, or why it stopped? This is what LLM observability gives you, and most AI engineers are skipping it entirely. I built an SOS detection app and used OpenTelemetry to get full visibility into every single call. Token usage, model version, finish reason, and cost per call all in one place, standardised across any provider. Check out the OpenTelemetry GenAI docs in the link below; there is a lot more you can track than you think.
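
As a rough illustration of what "token usage, model version, finish reason, and cost per call" can look like as span attributes, here is a sketch in plain Python. The `gen_ai.*` names follow the OpenTelemetry GenAI semantic conventions as I understand them; those conventions are still evolving, so check the current docs before relying on them. The `app.llm.cost_usd` attribute, the function name, and the prices are hypothetical, and the sketch builds a plain dict rather than calling the OpenTelemetry SDK.

```python
def llm_call_attributes(model, input_tokens, output_tokens, finish_reason,
                        usd_per_1k_input, usd_per_1k_output):
    """Build a dict of span attributes describing one LLM call.

    Prices are passed in by the caller; they are assumptions for this
    sketch, not values from any provider.
    """
    cost = (input_tokens / 1000 * usd_per_1k_input
            + output_tokens / 1000 * usd_per_1k_output)
    return {
        # Names below are meant to match the GenAI semantic conventions.
        "gen_ai.request.model": model,
        "gen_ai.usage.input_tokens": input_tokens,
        "gen_ai.usage.output_tokens": output_tokens,
        "gen_ai.response.finish_reasons": [finish_reason],
        # Hypothetical custom attribute: cost is computed by the app.
        "app.llm.cost_usd": round(cost, 6),
    }


attrs = llm_call_attributes(
    model="example-model",      # hypothetical model name
    input_tokens=2000,
    output_tokens=500,
    finish_reason="stop",
    usd_per_1k_input=0.5,       # hypothetical price
    usd_per_1k_output=1.5,      # hypothetical price
)
```

With an SDK in place, a dict like this could be attached to the span that wraps the model call, which is what makes the data comparable across providers.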

Observability Summit NA 2026: What the Community Is Thinking About

Two days in Minneapolis with the OpenTelemetry community, talking about where telemetry pipelines are headed and what the AI wave is doing to them. Two topics dominated everything: AI and cost reduction. Not as separate conversations, either. The more the community talked about AI telemetry, the more the cost question followed right behind it. I joined Diana Todea from VictoriaMetrics and Antonio Jimenez Martinez from Cisco ThousandEyes on the Telemetry That Matters panel.

How to Install and Configure an OpenTelemetry Collector

Originally published June 2024. Updated May 2026. A lot has changed since the first version of this guide. In May 2026, OpenTelemetry officially graduated within the CNCF, the highest maturity level a project can achieve. All three core signals (metrics, logs, and traces) are now stable across every major language SDK. Collector adoption has never been higher, and the ecosystem around it, particularly OpAMP for remote management, has matured significantly. This update walks through three things.

You don't need to pick one: how Sentry and OpenTelemetry work together

You already instrumented the backend with OpenTelemetry. Your services emit spans. Your teams know the OTel APIs. Maybe you already run a Collector. So when you start evaluating Sentry, the obvious question is: Do you need to replace your OpenTelemetry setup with the Sentry SDK? No. The practical answer is usually: keep OpenTelemetry where it already works, add the Sentry SDK where it gives you more application context, and send OpenTelemetry Protocol (OTLP) events to Sentry.