November 2022

Tracing with InfluxDB IOx

Tracing has always been a key use case for time series data. But admittedly, it’s also one that past versions of InfluxDB could not handle as well as we wanted. One of the roadblocks was the cardinality issue. Tracing data is, almost by definition, high cardinality data and prior to InfluxDB IOx, high cardinality data could affect query performance.

TraceQL: a first-of-its-kind query language to accelerate trace analysis in Tempo 2.0

The much-anticipated release of Grafana Tempo 2.0, which we previewed at ObservabilityCON 2022, will represent a huge step forward for the distributed tracing backend. Among the biggest highlights will be TraceQL, a first-of-its-kind query language that makes it easier than ever to find the exact trace you’re looking for.

Seeing vs. Understanding - The Power of Trace Visualization

It’s common in our everyday language to conflate seeing and understanding when the two are actually very different things. For example, if every day for the last few years we spoke briefly and wrote down the total number of Covid cases in the world, it would be easy to see some trends in the data—you would see the data. But if we present the same data drawn as a chart, it’s easy to understand where the spikes and dips are and when the situation got really bad.

AppSignal for Node.js 3.0 Introduces OpenTelemetry Support

After a period of beta testing, we're happy to announce the launch of our latest AppSignal for Node.js package. This package features six new integrations and uses the OpenTelemetry framework for reliable telemetry data collection. OpenTelemetry is an open standard that facilitates the instrumentation of standardized telemetry data collection. AppSignal is committed to using OpenTelemetry in new integrations, and our Node.js integration is the first to use the standard.

Unified Observability: The Role of Metrics, Logs, and Traces

There is significant momentum around observability, as detailed in VMware’s 2022 State of Observability report, with almost all respondents stating that observability would benefit their organization. This is further validated by Gartner including observability in their Magic Quadrant for Application Performance Monitoring and Observability report for the first time this year.

RedHat OpenShift monitoring with Splunk's OpenTelemetry Operator

Do you have an instant view of all the full-stack automated operations in your OpenShift environment? Would you like to monitor your self-service provisioning as code, to better understand health and performance? Have you been struggling to resolve service issues and reduce the time taken for troubleshooting across all your Kubernetes deployments? We’ve got you covered!

Jaeger Tracing: Pros, Cons, Alternatives and Best Practices

OpenTelemetry (OTel) is an open source CNCF (Cloud Native Computing Foundation) project that provides tools, APIs and SDKs for observability data collection (i.e., logs, metrics and traces) from cloud-native applications. Developers can use the data collected from OTel to monitor and analyze application health and performance. To leverage the data and its insights, you can export the data to external solutions, like APMs, open source Jaeger and Zipkin, Helios, and others.

AppSignal's Future with OpenTelemetry

AppSignal is a strong supporter of open-source technology. We owe so much of our modern world to the unseen, hard-working software developers who build and maintain the many technologies that make everything from reading this article to sending a message from your phone possible. That's why we're investing in OpenTelemetry, the open-source standard for telemetry data collection, rather than developing our own independent standard.

Independence with OpenTelemetry on Elastic

The drive for faster, more scalable services is on the rise. Our day-to-day lives depend on apps, from a food delivery app that brings your favorite meal, to a banking app for managing your accounts, to apps for scheduling doctor’s appointments. These apps need to be able to grow not only in features but also in user capacity. The scale and need for global reach drive increasing complexity for these high-demand cloud applications.

Reduce Data Costs: Log Sampling with OpenTelemetry and BindPlane OP

Redundant logs are a common nuisance in observability pipelines of all kinds. In large environments, excess logs can multiply data costs to unsustainable amounts. Log sampling is the process of randomly sampling logs to produce the same valuable insight with dramatically reduced data flow. Configuring agents in a pipeline to appropriately sample logs can be a pain. Pipeline managers, like BindPlane OP, make that process simple and scalable.
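As a rough illustration of the random sampling described above, here is a minimal Python sketch. It is not how BindPlane OP or any OpenTelemetry processor is implemented; the function name `sample_logs`, the `rate` parameter, and the example log lines are all hypothetical and chosen only for illustration.

```python
import random


def sample_logs(records, rate, seed=None):
    """Keep each record independently with probability `rate` (hypothetical helper)."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    rng = random.Random(seed)  # seeded only so the example is repeatable
    return [record for record in records if rng.random() < rate]


# Made-up, highly redundant log lines standing in for a real stream.
logs = [f"GET /health 200 #{i}" for i in range(10_000)]
kept = sample_logs(logs, rate=0.1, seed=42)
print(len(kept), "of", len(logs), "records kept")
```

With a rate of 0.1, roughly one record in ten survives, so downstream volume drops by about 90 percent while the overall shape of the stream stays visible.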

5 Reasons Why OpenTelemetry is the Future of Observability

It has been said that open source is eating the world and in the observability space, the project behind this movement is OpenTelemetry. The project is quickly becoming the standard for instrumentation and collection of observability data. Why is an open standard and open-source approach to instrumentation and data collection so compelling? This talk will provide five reasons why OpenTelemetry is disrupting the observability market.

Replaying flows and troubleshooting issues in mobile app development using OpenTelemetry

iOS and Android apps are often a common component of distributed applications, forming a key part of the software architecture. These mobile apps provide another way to access data and perform actions on various services, requiring tight integration between the apps and the components which serve the data and control it.

How to Monitor SNMP with OpenTelemetry

With observIQ’s latest contributions to OpenTelemetry, you can now use free open source tools to easily aggregate data across your entire infrastructure to any or multiple analysis tools. The easiest way to use the latest OpenTelemetry tools is with observIQ’s distribution of the OpenTelemetry collector. You can find it here.

Grafana Agent 0.29.0 release: New OpenTelemetry components

Today the Grafana Agent team is excited to announce the release of Grafana Agent v0.29.0. This September, we introduced a new way to easily run and configure Grafana Agent called Grafana Agent Flow, our new dynamic configuration runtime built on components. Within Flow, we are also embracing Grafana Labs’ big tent philosophy by introducing OpenTelemetry (OTel) Collector components and converters for traces, metrics, and logs in Agent v0.29.0.

"Managing OpenTelemetry Through the OpAMP Protocol" by Mike Kelly, observIQ

Managing thousands of data collection Agents across just as many servers can overwhelm DevOps teams. Open Agent Management Protocol (OpAMP) is a new network protocol from the OpenTelemetry Project that enables remote management of OpenTelemetry collectors: they can report their status to a server, receive configuration from it, and receive agent package updates. This eliminates the need to create new custom distributions and redeploy, drastically simplifying Agent management.

Sumo Logic's investment in OTel

When teams collect data without full observability of what others on the team can see, it becomes clear that no one’s picture is truly accurate. In this picture, all of the people are wearing blindfolds and feeling around to see what is in front of them. One thinks this creature is a spear, another thinks it is a tree trunk, and another a rope. As long as they cannot observe what the others can, there is poor data fidelity.

Deploying OpenTelemetry Organizationally: From Proof of Concept to In-Production at Scale

Observability involves telling a coherent story about an entire system. Over the years, video streaming service Pluto TV has had to navigate many storytellers in terms of observability vendors, tools, and formats before settling on OpenTelemetry to analyze and compare features across its many destination platforms. During this presentation, you'll see how Bharathi Ramachandran—Engineering Manager at Pluto TV—used OpenTelemetry to implement his initial proof of concept and get his entire organization shipping observability data at scale.

What Can OpenTelemetry Distributed Tracing Architecture Do for Frontend Developers?

When developers talk about the options OpenTelemetry opens up to them, one of the most powerful use cases is troubleshooting distributed architectures. With OTel data and insights, developers can identify bugs and solve a wide range of issues across various types of architecture and flows. These include asynchronous flows, flows with Lambda functions, and many more.

Monitoring Cloud Database Costs with OpenTelemetry and Honeycomb

In the last few years, the usage of databases that charge by request, query, or insert—rather than by provisioned compute infrastructure (e.g., CPU, RAM, etc.)—has grown significantly. They’re popular for a lot of the same reasons that serverless compute functions are, as the cost will scale with your usage. No one is using your site? No problem: you’re not charged.

OpenTelemetry vs. Prometheus

OpenTelemetry and Prometheus are classified as monitoring tools, but they also have significant differences that your company should know about. For cloud-native applications, OpenTelemetry is the future of instrumentation. It’s the first critical step that allows companies to monitor and improve application performance. OpenTelemetry also supports multiple programming languages and technologies.

How to correlate performance testing and distributed tracing to proactively improve reliability

At ObservabilityCON, we announced our first step towards launching a native integration between Grafana k6 load testing and Grafana Tempo tracing (k6 x Tempo) in Grafana Cloud. We created k6 x Tempo to help dev, testing, and operation teams analyze their performance test results more effectively and proactively improve the reliability of their business-critical applications.

OpenTelemetry, Auto-Instrumentation and Splunk Observability Cloud: A Jump Start

Have you been meaning to learn about OpenTelemetry and the integration of all available application and service telemetry? If you like to learn things by doing, get ready to dive in and have some fun with OpenTelemetry and Splunk Observability Cloud. Quickly learn more about OpenTelemetry auto-instrumentation and collectors at your own pace with these walkthroughs and guides.

Distributed tracing for Azure - Spot failures in the message flow

Serverless360 is a cloud management platform engineered for Microsoft Azure that brings enterprise-grade monitoring, tracing, remediation & governance under one roof. Everything you need to empower your Azure operations teams with more meaningful features and deliver effortless support.

Distributed Tracing: Build vs. Buy

With serverless and containerized applications becoming the norm, workloads and integrations are spread across multiple cloud environments. As these apps become increasingly distributed, monitoring also becomes more complicated with siloed and incomplete telemetry. This is where distributed tracing brings great value. It enables end-to-end visibility in your modern and complex application.

What is Jaeger Distributed Tracing?

Distributed tracing is the ability to follow a request through a software system from beginning to end. While that may sound trivial, a single request can easily spawn multiple child requests to different microservices with modern distributed architectures. These, in turn, trigger further sub-requests, resulting in a complex web of transactions to service a single originating request.
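The parent and child relationship described above can be sketched in a few lines of Python. This is a simplified model, not Jaeger's data format or API: the `Span` class, its fields, and the service names are assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """One unit of work in a trace (simplified, hypothetical model)."""
    span_id: str
    parent_id: Optional[str]  # None marks the originating request
    service: str


def render_trace(spans):
    """Return the spans as indented lines, children nested under parents."""
    children = {}
    for span in spans:
        children.setdefault(span.parent_id, []).append(span)

    lines = []

    def walk(parent_id, depth):
        for span in children.get(parent_id, []):
            lines.append("  " * depth + f"{span.service} ({span.span_id})")
            walk(span.span_id, depth + 1)

    walk(None, 0)
    return lines


# A single originating request that fans out into sub-requests.
spans = [
    Span("a", None, "api-gateway"),
    Span("b", "a", "checkout"),
    Span("c", "b", "payments"),
    Span("d", "b", "inventory"),
    Span("e", "a", "recommendations"),
]
print("\n".join(render_trace(spans)))
```

Each span records only its own parent, yet that is enough to rebuild the full tree of sub-requests behind one originating request.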

Visualizing GraphQL Traces in Microservices

One of the things that most excites me about what we at Helios are doing differently than anyone else is trace visualizations. While there are many ways to troubleshoot microservice architectures, a good visual overview goes a really long way to speeding up understanding and therefore accelerating time to a resolution. When your manager asks, “Why did that break down?” with Helios you can answer quickly with accurate data—this is the value of the Helios platform.