Tracing

epsagon

Distributed Tracing vs. Logging: Comparison 101

Microservice Architecture introduces operational complexity when it comes to monitoring service-to-service communication and diagnosing performance issues. From an observability perspective, it is imperative to have in-depth visibility into your systems to ensure debugging is convenient and that you can recover from failure scenarios faster.

logz.io

Jaeger Essentials: Best Practices for Deploying Jaeger on Kubernetes in Production

Logs, metrics and traces are the three pillars of the Observability world. The distributed tracing world, in particular, has seen a lot of innovation in recent months, with OpenTelemetry standardization and with Jaeger open source project graduating from the CNCF incubation. According to the recent DevOps Pulse report, Jaeger is used by over 30% of those practicing distributed tracing.

kong

Tracing With Zipkin in Kong 2.1.0

There is a great number of logging plugins for Kong, which might be enough for your needs. However, they have certain limitations: Most of them only work on HTTP/HTTPS traffic. They make sense in an API gateway scenario, with a single Kong cluster proxying traffic between consumers and services. Each log line will generally correspond to a request which is “independent” from the rest.

jaegertracing

How to maximize span ingestion while limiting writes per second to Scylla with Jaeger

Jaeger primarily supports two backends: Cassandra and Elasticsearch. Here at Grafana Labs we use Scylla, an open source Cassandra-compatible backend. In this post we’ll look at how we run Scylla at scale and share some techniques to reduce load while ingesting even more spans. We’ll also share some internal metrics about Jaeger load and Scylla backend performance. Special thanks to the Scylla team for spending some time with us to talk about performance and configuration!

jaegertracing

Where did all my spans go? A guide to diagnosing dropped spans in Jaeger

Nothing is more frustrating than feeling like you’ve finally found the perfect trace only to see that you’re missing critical spans. In fact, a common question for new users and operators of Jaeger, the popular distributed tracing system, is: “Where did all my spans go?” In this post we’ll discuss how to diagnose and correct lost spans in each element of the Jaeger ingestion pipeline.

datadog

Instrument your Python applications with Datadog and OpenTelemetry

If you are familiar with OpenTracing and OpenCensus, then you have probably already heard of the OpenTelemetry project. OpenTelemetry merges the OpenTracing and OpenCensus projects to provide a standard collection of APIs, libraries, and other tools to capture distributed request traces and metrics from applications and easily export them to third-party monitoring platforms.

grafana

How to maximize span ingestion while limiting writes per second to a Scylla backend with Jaeger tracing

Jaeger primarily supports two backends: Cassandra and Elasticsearch. Here at Grafana Labs we use Scylla, an open source Cassandra-compatible backend. In this post we’ll look at how we run Scylla at scale and share some techniques to reduce load while ingesting even more spans. We’ll also share some internal metrics about Jaeger load and Scylla backend performance. Special thanks to the Scylla team for spending some time with us to talk about performance and configuration!

sumologic

Distributed tracing analysis backend that fits your needs

I am spending a considerable amount of time recently on distributed tracing topics. In my previous blog, I discussed different pros and cons of various approaches to collecting distributed tracing data. Right now I would like to draw your attention to the analysis back-end: what does it take to be good at analyzing transaction traces?

lightstep

How Distributed Tracing Makes Logs More Valuable

A few months ago, I sat down to chat with a platform team leader from one of Lightstep’s customers. We were talking about observability (big surprise!), and he made a fascinating remark: logging has become a “selfish” technology. So, what did he mean by this?? What is logging withholding with its selfishness? And how can we address this with a more thoughtful and unified approach to observability?