How to Measure the Business Impact of Digital Employee Experience (DEX)

Not long ago, digital employee experience (DEX) was just a line on an IT report — something to track uptime, device issues and help desk tickets. Those metrics matter to IT, but they don’t always resonate with the C-suite. But behind the numbers is a larger story: every slowdown, every frustrating login, every delayed ticket chips away at productivity, engagement and business results.

Enhanced Icinga 2 Container Images

As some of you might have already noticed, we recently gave our official Icinga 2 container image builds a complete overhaul. These new images are currently available only as snapshot builds but will replace the existing stable images with the next Icinga 2 v2.16.0 release. In this blog post, we’ll walk you through the key changes and improvements that come with the new images, as well as the reasons behind these changes.

What to Do When Legacy Systems Threaten Your Digital Transformation Goals

You can’t blame them. Despite all their shortcomings, legacy systems are so tightly woven into vital operations across industries that eliminating them is far from straightforward. The real problem is not the systems themselves, but their limited adaptive capacity. Legacy architectures can’t anticipate what the next disruption will look like.

GitKraken Desktop 11.5: We Fixed What Mattered Most

GitKraken Desktop 11.5 delivers massive performance improvements where they count most: opening repos up to 5x faster, stash refreshes 100x faster, and branch/tag loading 100x faster. No workflow changes required. Just measurably faster Git operations that give you back your time and flow. Ready to see it in action? Check out the YouTube tutorial below. We need to talk about something that’s been frustrating many of you: performance.

Why Cloud Managed Data Center Services Are Having A Moment

The obituary for the data center was written too soon. While the cloud dominates today’s IT headlines, traditional infrastructure hasn’t disappeared. It is evolving. Enterprises still rely on data centers for control, compliance, and reliability. However, they increasingly need the agility, scalability, and cost visibility that the cloud promises. Cloud managed data center services are bridging this gap.

Top 9 LLM Observability Tools in 2025

Organizations are adding GenAI to their current and future architectures and product roadmaps, requiring Ops teams to ensure LLMs are accurate, fast, secure and cost-efficient. LLM observability tools directly address these needs, helping identify and prevent common LLM errors and issues. LLM observability provides the telemetry data for this analysis. LLM observability tools trace requests end-to-end, evaluate outputs, and correlate quality with latency, cost, prompts, tools, and data sources.

Vibe Coding: Closing The Feedback Loop With Traceability

I have begun to truly embrace vibe coding over the last few months, using Cursor as my main code editor and Claude Sonnet 4 for my agent's LLM. It's an exciting time to be a developer: we get to experiment with something that promises to 100x our productivity while pioneering the new workflows and strategies for implementing these tools. But, as most people who have done any extensive development with LLMs in a sufficiently large code base know, it's a bit like trying to herd scared cats.

ObservabilityCON 2025: A guide to all the announcements from Grafana Labs

Today at ObservabilityCON 2025 in London, we unveiled a number of exciting announcements and updates to Grafana Cloud that reimagine SaaS economics, simplify the complexity of running your observability stack at scale, and provide AI tooling that’s actually useful. (Root cause analysis via chatbot? Yes, please!) Check out the keynote to learn more about how we’re helping you do more with the open observability cloud, and read on for a quick recap of all the news from ObservabilityCON 2025.

AI-powered observability: Resolve incidents faster, reduce alert fatigue, and expand access

When an incident lands in your lap, you’ll often start with a lot of questions: Why is latency so high? What’s causing this outage? How much money are we losing at this very moment? The uncertainty, and the pressure to quickly find answers, has always been one of the more nerve-racking parts of being an on-call engineer, but it doesn’t have to be that way anymore.

Maximize data value and cut costs: Adaptive Telemetry for metrics, logs, traces, and profiles in Grafana Cloud

When it comes to observability, more data doesn’t always mean more clarity. In fact, as telemetry volumes grow, it only becomes more difficult to discern the signals from the noise and to keep overall costs in check. This is exactly why we built Adaptive Telemetry, a suite of features in Grafana Cloud that analyzes how your telemetry is used and then automatically recommends actions like aggregating, sampling, dropping, or reducing low-value data.