Operations | Monitoring | ITSM | DevOps | Cloud

Tracing

The latest News and Information on Distributed Tracing and related technologies.

How to maximize span ingestion while limiting writes per second to a Scylla backend with Jaeger tracing

Jaeger primarily supports two backends: Cassandra and Elasticsearch. Here at Grafana Labs we use Scylla, an open source Cassandra-compatible backend. In this post we’ll look at how we run Scylla at scale and share some techniques to reduce load while ingesting even more spans. We’ll also share some internal metrics about Jaeger load and Scylla backend performance. Special thanks to the Scylla team for spending some time with us to talk about performance and configuration!

Distributed tracing analysis backend that fits your needs

I am spending a considerable amount of time recently on distributed tracing topics. In my previous blog, I discussed different pros and cons of various approaches to collecting distributed tracing data. Right now I would like to draw your attention to the analysis back-end: what does it take to be good at analyzing transaction traces?

Using Observability as a Proxy for Customer Happiness

Today, users and customers are driven by response rates to their online requests. It’s no longer good enough to just have a request run to completion, it also has to fit within the perceived limits of “fast enough”. Yet, as we continue to build cloud-native applications with microservice architectures, driven by container orchestration like Kubernetes in public clouds, we need to understand the behavior of our system across all aspects, not just one.

OpenTelemetry, Open Collaboration

OpenTelemetry — the merger of OpenCensus and OpenTracing — appeared in May of 2019, led by companies like Omnition (now a part of Splunk), Google, Microsoft, and others who are pushing the curve on observability. OpenTelemetry is a project within the Cloud Native Computing Foundation (CNCF) that has gathered contributors and supporters far and wide, becoming one of the most active projects found in open source today. It’s currently #2 behind only Kubernetes!

Where did all my spans go? A guide to diagnosing dropped spans in Jaeger distributed tracing

Nothing is more frustrating than feeling like you’ve finally found the perfect trace only to see that you’re missing critical spans. In fact, a common question for new users and operators of Jaeger, the popular distributed tracing system, is: “Where did all my spans go?” In this post we’ll discuss how to diagnose and correct lost spans in each element of the Jaeger span ingestion pipeline.

Distributed Tracing & Logging - Better Together

Monitoring requires a multi-faceted approach if DevOps teams want end-to-end visibility and deep insight into issues. This is especially true in the case of modern microservices applications, which are essentially collections of distributed services that talk to each other over a service mesh. With monolithic applications, requests can be tracked easily from the client to the server and back, but with modern applications, every request passes through numerous services before completion.

Distributed Tracing Tools and New Industry Standards

Metrics and logs have been around for a long time, yet we haven’t adopted common standards for them. Sure, there have been attempts on the metric side with OpenMetrics. Similarly, tracing only got a standardization effort with OpenTracing just a few years ago. There was no effort in a unified approach to standardize all observability signals until OpenTelemetry began a little less than two years ago. And there has been a need.

Dogfooding Chronicles: Thinking Backward, Moving Forward

Zac Propersi, Engineering Manager at Sentry, can tell when a page is not loading as fast as it should — just by looking at it. While working on our new Metric Alerts feature, Zac noticed that the alerts pages were rendering slowly. Being the super Sentry user that he is, he wrote a custom query in Discover to see just how slow the transactions were.