Operations | Monitoring | ITSM | DevOps | Cloud

Mute timing vs. silences in Grafana Alerting: How to pick the best fit for your use case

Have you ever been in a situation where you know your team is going to run its weekly maintenance window, so you silence your notifications to prevent a flood of false positives from pinging your inbox? If you are associated with a team that uses any type of alert system, you know how easily alert fatigue can happen. The incessant and unpredictable (or even, at times, predictable) pings, emails, and notification alerts can drive even the most serene worker totally batty.

How to perform real-time DNS monitoring in Grafana Cloud

When DNS (Domain Name System) resolution fails or becomes sluggish, users can experience timeouts, connection errors, and degraded performance, often without a clear indication of the root cause. This is where DNS checks in Grafana Cloud Synthetic Monitoring come in, allowing you to proactively monitor domain name resolution, verify that domains resolve to the correct IP address, and even measure how quickly that resolution occurs.

Single-tenant vs. multi-tenant architecture with Grafana Cloud: A guide to choosing the right approach

Grafana Cloud’s flexibility is one of its greatest strengths, but the breadth of choices can sometimes be overwhelming. We see this a lot when it comes to selecting the right architectural approach, with organizations unsure of how many stacks they need to host their environment. Grafana Cloud provides robust features for managing tenancy, enabling organizations to effectively handle diverse teams and projects.

Faster, more memory-efficient performance in Grafana Mimir: a closer look at Mimir Query Engine

Until recently, Grafana Mimir — our open source, horizontally scalable, multi-tenant time series database (TSDB) — has exclusively used Prometheus’ PromQL engine to evaluate queries. While the PromQL engine works great, it sometimes needs a lot of memory to run, specifically in the Mimir querier component. To address this memory consumption issue, we recently introduced Mimir Query Engine (MQE).

Distributed performance testing for Kubernetes environments: Grafana k6 Operator 1.0 is here

Performance testing is critical to building reliable applications, but testing at scale, especially inside modern Kubernetes environments, can be a challenge. For example, how do you coordinate tests across multiple nodes, test private services without compromising security, or even do both at once? And most importantly, how do you do all this without adding too much operational complexity to your stack?

How to connect ServiceNow to Grafana Cloud IRM incidents

Companies rely on a variety of services to streamline their workflows, which often requires data synchronization or information sharing across platforms. But are your tools flexible enough to connect with external systems? ServiceNow is widely recognized for its robust and complex workflow support for enterprises. However, it may not always offer the most intuitive or user-friendly experience when handling incidents.

Debug, query, and build faster with AI: How we use Grafana Assistant at Grafana Labs

We recently released Grafana Assistant into public preview for Grafana Cloud, and we’ve been excited to see how our customers have already made it part of their daily observability routines. At the same time, Assistant is becoming a go-to companion for developers right here at Grafana Labs, whether they’re debugging on-call issues, helping customers, or trying to remember tricky PromQL syntax.

A smarter filter for Grafana Alerting: Introducing a new way to find your alerts

At Grafana Labs, we believe that effective alerting is the cornerstone of any robust observability strategy. That’s why we’re constantly listening to your feedback and working to improve the Grafana user experience so it’s easier for you to manage and interact with your alert rules. Today, we’re excited to tell you about an update in Grafana Alerting that’s built to address some of your biggest pain points.

Measuring service response time and latency: How to perform a TCP check in Grafana Cloud Synthetic Monitoring

When your database stops accepting connections or your mail server becomes unreachable during business hours, the impact is immediate and costly. Fortunately, the right monitoring strategy can help you detect these TCP connection failures early on, and prevent them from impacting the user experience.

Managing access in Grafana: a single stack journey with teams, roles, and real-world patterns

When multiple teams use Grafana, it can start to feel a bit messy. Dashboards pile up, permissions become unclear, and teams accidentally overwrite each other’s work. To help you and your organization stay clear, collaborative, and secure, we recommend putting all users in a single Grafana Cloud stack and managing access with teams, roles, and folders. To illustrate this, I’ll share a hypothetical example of how you can put this into practice across three teams. Let’s dive in!