Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Website content checks are now multi-region!

We’ve made an upgrade to the “Check for Content” feature in our website monitors! Previously, users could select just one location at a time from US West, US East, EU West, and AU East via radio buttons. Now you can monitor your content from multiple locations instead of just one. By choosing multiple regions you can ensure that transient network issues don’t cause your monitor to go down unexpectedly.
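
As a rough illustration of why multiple regions help, here is a minimal sketch of how per-region results might be combined. The post does not say how the product aggregates regions, so the rule below (the monitor stays up while at least `min_passing` regions find the content) and the names `content_found` and `monitor_is_up` are assumptions made for this example.

```python
from typing import Mapping


def content_found(body: str, expected: str) -> bool:
    """Return True when the expected text appears in a fetched page body."""
    return expected in body


def monitor_is_up(region_results: Mapping[str, bool], min_passing: int = 1) -> bool:
    """Combine per-region content checks into one monitor status.

    Assumption for this sketch: the monitor is up while at least
    `min_passing` regions found the expected content.
    """
    if not region_results:
        raise ValueError("at least one region is required")
    passing = sum(1 for ok in region_results.values() if ok)
    return passing >= min_passing


# One region hits a transient network issue; the others still see the content.
results = {"US West": False, "US East": True, "EU West": True}
status = monitor_is_up(results)
```

With a single region, the one failed check above would have marked the monitor as down; with three regions, the two passing checks keep it up under this assumed rule.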

Query Language Not Required! Explore Apps Suite Demo (Logs, Metrics, Traces, Profiles) | Grafana

This talk dives into making observability more accessible with Grafana’s Explore apps suite. This new experience eliminates the need to write queries as you visualize and explore your data. Explore Metrics and Explore Logs (both GA) simplify navigating Prometheus and Loki data with an intuitive UI, so you don't have to write PromQL or LogQL. They come with improvements like better related metrics recommendations, OpenTelemetry logging support, and enhanced pattern detection.

Set Up Links Between Data Sources With the New Correlations Feature | Demo | Grafana 11.3

Correlations is a feature that allows Grafana users to set up links between their data sources. Previously, the generated link could only go from one query to another, meaning results from a query could only generate links to open a second Explore pane with other query results. With this feature, users can now link to third-party web-based software based on their search results. The format follows the standard Grafana format for using variables. This is generally available in all editions of Grafana.
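
To make the idea of a variable-based link concrete, here is a small sketch of filling a URL template from a query result. It is not Grafana's implementation. It assumes the `${name}` variable syntax and uses Python's `string.Template`, which happens to share that syntax. The ticketing URL, the `traceID` field, and the `build_link` helper are hypothetical.

```python
from string import Template
from typing import Mapping
from urllib.parse import quote


def build_link(url_template: str, fields: Mapping[str, str]) -> str:
    """Fill ${name} placeholders in a URL template with URL-encoded field values."""
    encoded = {name: quote(str(value), safe="") for name, value in fields.items()}
    # substitute() raises KeyError if the template names a field that is missing.
    return Template(url_template).substitute(encoded)


# Hypothetical third-party URL and result field, for illustration only.
link = build_link(
    "https://tickets.example.com/search?q=${traceID}",
    {"traceID": "abc 123"},
)
```

Encoding each value before substitution keeps spaces and other reserved characters in a result field from breaking the generated URL.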

Why Quality Matters: A Conversation with NDepend

In this episode of Founder & Friends, John-Daniel Trask, co-founder and CEO of Raygun, sits down with Patrick Smacchia, Founder and CEO of NDepend, to share their stories and strategies for building excellent software. They discuss the intricacies of the .NET ecosystem, strategies for sustaining high-quality software, and the evolution of development tools. Gain insights into NDepend's methods for managing dependencies, refining code, and optimizing performance. This episode is essential for developers aspiring to advance their technical abilities and produce superior software.

Flaky tests: their hidden costs and how to address flaky behavior

Flaky tests are bad—this is a fact implicitly understood by developers, platform and DevOps engineers, and SREs alike. When tests flake (i.e., generate conflicting results across test runs, without any changes to the code or test), they can arbitrarily fail builds, requiring developers to re-run the test or the full pipeline. This process can take hours—especially for large or monolithic repositories—and slow down the software delivery cycle.
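
The definition above (conflicting results across runs with no change to the code or test) can be sketched as a simple check over test history. This is an illustrative sketch, not any vendor's detection method. The `RunRecord` type and `find_flaky_tests` function are hypothetical names, and the sketch assumes that a commit identifier is enough to establish that the code and test did not change.

```python
from collections import defaultdict
from typing import Iterable, List, NamedTuple


class RunRecord(NamedTuple):
    test: str      # test name
    commit: str    # commit the run executed against
    passed: bool   # outcome of this run


def find_flaky_tests(runs: Iterable[RunRecord]) -> List[str]:
    """Return tests that both passed and failed on the same commit."""
    outcomes = defaultdict(set)
    for run in runs:
        outcomes[(run.test, run.commit)].add(run.passed)
    flaky = {test for (test, _commit), seen in outcomes.items() if len(seen) == 2}
    return sorted(flaky)


history = [
    RunRecord("test_login", "a1b2c3", True),
    RunRecord("test_login", "a1b2c3", False),    # same commit, different result
    RunRecord("test_checkout", "a1b2c3", False),
    RunRecord("test_checkout", "d4e5f6", True),  # changed commit, so not flagged
]
flaky_tests = find_flaky_tests(history)
```

Grouping by both test and commit is what separates a flake from a real regression: `test_checkout` changes outcome only when the commit changes, so it is not flagged.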

Beyond Their Intended Scope: Uzing into Russia

The first installment of our new blog series, Beyond Their Intended Scope, covers BGP mishaps that may have escaped the community’s attention but are worthy of analysis. In this post, we review a recent BGP leak that redirected internet traffic through Russia and Central Asia as a result of a path error leak by Uztelecom, the incumbent service provider of Uzbekistan.

Key Metrics to Monitor for a Healthy Kafka Cluster

Maintaining a healthy Kafka cluster is critical to ensuring your real-time data pipelines run smoothly. However, keeping your Kafka environment in tip-top shape isn’t just about setting it up and letting it run. Regular monitoring of key metrics is essential to catch issues before they escalate, optimize performance, and keep everything humming along smoothly. So, what should we be looking at when it comes to Kafka metrics? Let’s break down the most important ones and how to interpret them.