Grafana

grafana

How to troubleshoot remote write issues in Prometheus

Prometheus’s remote write system has a lot of tunable knobs, and in the event of an issue, it can be unclear which ones to adjust. In this post, we’ll discuss some metrics that can help you diagnose remote write issues and decide which configuration parameters you may want to try changing. First, let’s discuss how remote write is implemented. In the past, remote write would duplicate samples coming into Prometheus via scrape.

How we use metamonitoring Prometheus servers to monitor all other Prometheus servers at Grafana Labs

One of the big questions in monitoring can be summed up as: Who watches the watchers? If you rely on Prometheus for your monitoring, and your monitoring fails, how will you know? The answer is a concept known as metamonitoring. At Grafana Labs, a handful of geographically distributed metamonitoring Prometheus servers monitor all other Prometheus servers and each other cross-cluster, while their alerting chain is secured by a dead-man’s-switch-like mechanism.

Using Telegraf plugins to visualize industrial IoT data with the Grafana Cloud Hosted Prometheus service

One of the biggest challenges with data visualization for complicated software systems is getting quick access to the underlying data and connecting it to some form of cloud-hosted solution. Traditionally it has required quite a bit of middleware and upfront setup with additional tooling.

You should know about... these useful Prometheus alerting rules

Setting up Prometheus to scrape your targets for metrics is usually just one part of your larger observability strategy. The other part of the equation is figuring out what you want your metrics to tell you, and when and how often you should know about it. Thankfully, Prometheus makes it really easy for you to define alerting rules using PromQL, so you know when things are going north, south, or in no direction at all.

Intro to exemplars, which enable Grafana Tempo's distributed tracing at massive scale

Exemplars have been a hot topic in observability recently, and for good reason. Just as Prometheus disrupted the cost structure of storing metrics at scale beginning in 2012 and for real in 2015, and Grafana Loki disrupted the cost structure of storing logs at scale in 2018, exemplars are doing the same for traces. To understand why, let’s look at both the history of observability in the cloud native ecosystem and the optimizations exemplars enable.

Want to visualize software development insights with Grafana? With our new Jira Enterprise plugin, you can!

A very fun part of my job as a Solutions Engineer at Grafana Labs is getting to learn the ins and outs of a new feature or play with a plugin while it is still in development. So, when I heard murmurs that our latest Enterprise plugin would be an integration with Jira, I felt the forsaken call of the agile sirens luring me back to my days when I worked as a technical writer on a product team.

How we're graduating Grafana Agent experiments into the official Prometheus project

We’ve been experimenting with new ways to use and operate Prometheus over the past year. Every successful Grafana Agent experiment turns into an upstream contribution for the whole Prometheus community to benefit from. In this blog post, I go over the history of the Agent’s successful — and not so successful — experiments.

Grafana 7.5 released: Loki alerting and label browser for logs, next-generation pie chart, and more!

Grafana v7.5 has been released! This is the last stable release before we launch Grafana 8.0 at GrafanaCONline in June. Register for free now, so you won’t miss the great sessions we’re planning around all things Grafana. And if you’re doing something special with Grafana that you’d like to share with the community, the CFP for GrafanaCONline is open until 06:59 UTC on April 10! Now, back to 7.5.

2021: The year of Cortex for IoT?

My Grafana Labs colleague RichiH recently talked about why IoT and time series databases work so well together. It just so happens that we have a highly scalable time series database on hand. Let’s talk about that. My name is Goutham, and I am a maintainer for Cortex. I have been working on it for nearly three years out of the four-and-a-half years the project has existed. Cortex is built to serve as a scalable, long-term store for Prometheus.

How I fell in love with logs thanks to Grafana Loki

As part of my job as a Senior Solutions Engineer here at Grafana Labs, I tend to pretty easily find ways out of technical troubles. However, I was recently having some Wi-Fi issues at home and needed to do some troubleshooting. My experience changed my whole opinion on logs, and I wanted to share my story in hopes that I could open up some other people’s eyes as well. (I originally posted a version of this story on my personal blog in January.) First, some background info.