Benchmarking Grafana Enterprise Metrics for horizontally scaling Prometheus up to 500 million active series

Since we launched Grafana Enterprise Metrics (GEM), our self-hosted Prometheus service, last year, we’ve seen customers run it at great scale. We have clusters with more than 100 million metrics, and GEM’s new scalable compactor can handle an estimated 650 million active series. Still, we wanted to run performance tests that would more definitively show GEM’s horizontal scalability and allow us to get more accurate TCO estimates.

How PayIt, a secure cloud service provider for digital government, uses Grafana and Prometheus for observability at cloud native scale

A trip to the DMV — and a realization that there had to be a better, more modern way for the system to work — sparked the idea for PayIt, a secure cloud service provider for digital government that launched in 2013. The company’s mission is to help state, local, and government agencies reach their constituents better and more effectively, shifting the reliance from in-office payments to digital ones.

We've added first-class Windows support to Grafana Agent

The Grafana Agent team is happy to announce that Grafana Agent 0.14.0-rc2 includes improved Windows support. Up until now, running Grafana Agent (our tool for gathering metrics, logs, and traces) on Windows was difficult and did not follow Windows best practices. In short, it was not a good Windows citizen. In the new release candidate, we’re making changes to improve the experience, based on feedback from GitHub issues, customer contacts, and our own experience.

Q&A with Grafana Labs CEO Raj Dutt about our licensing changes

When Grafana Labs CEO and co-founder Raj Dutt announced to the team that the company would be relicensing our core open source projects from Apache 2.0 to AGPLv3, he opened the floor for discussion and encouraged anyone who had further questions to reach out. We believe in honesty and transparency, so we collected hard questions from Grafanistas, and Raj answered them for this public Q&A. The time felt right. As I’ve said publicly before, I’ve been thinking about this topic for years.

Grafana, Loki, and Tempo will be relicensed to AGPLv3

Grafana Labs was founded in 2014 to build a sustainable business around the open source Grafana project, so that revenue from our commercial offerings could be re-invested in the technology and the community. Since then, we’ve expanded further in the open source world — creating Grafana Loki and Grafana Tempo and contributing heavily to projects such as Graphite, Prometheus, and Cortex — while building the Grafana Cloud and Grafana Enterprise Stack products for customers.

Introducing the new Open Distro for Elasticsearch plugin for Grafana, also available in Amazon Managed Service for Grafana

Back in December, Amazon Web Services (AWS) and Grafana Labs partnered to launch the Amazon Managed Service for Grafana in a preview to a limited set of customers. Amazon Managed Service for Grafana is a scalable managed offering that provides AWS customers a native way to run Grafana directly within AWS alongside all their other AWS services.

Easily monitor your Tencent Cloud services with the new Grafana plugin

Plugins help Grafana users get value from their data faster. With a few clicks, you can start tapping into the different data stores you and your business already leverage, and see them all in one place in your Grafana dashboard. I’m a huge fan of partner-developed plugins for a few reasons, with my favorite being subject matter expertise. Who better to develop your plugin than the team that knows the product inside out?

How to send traces to Grafana Cloud's Tempo service with OpenTelemetry Collector

As an open source company, we understand the value of open standards and interoperability. This holds true for Grafana Cloud and our managed Tempo service for traces, which is currently in beta. The Grafana Agent makes it easy to send traces to Grafana Cloud, but it is not required. In fact, Grafana Cloud’s Tempo service is exposed as a standards-compliant gRPC endpoint that conforms to the OpenTelemetry TraceService with HTTP Basic authorization.
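
As a rough illustration of the HTTP Basic authorization mentioned above, the following minimal sketch builds the `Authorization` header value that a client would attach to its requests. The function name, the username, and the API key are hypothetical placeholders, not values taken from Grafana Cloud.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Return the value of an HTTP Basic `Authorization` header.

    Basic auth joins the credentials with a colon and base64-encodes
    the result (RFC 7617).
    """
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Basic {token}"


# Hypothetical credentials, for illustration only.
header_value = basic_auth_header("example-user-id", "example-api-key")
print(header_value)
```

An exporter that supports custom headers could send this value as its `authorization` header; how that is configured depends on the collector or SDK in use.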

How to troubleshoot remote write issues in Prometheus

Prometheus’s remote write system has a lot of tunable knobs, and in the event of an issue, it can be unclear which ones to adjust. In this post, we’ll discuss some metrics that can help you diagnose remote write issues and decide which configuration parameters you may want to try changing. First, let’s discuss how remote write is implemented. In the past, remote write would duplicate samples coming into Prometheus via scrape.