
OTel Updates: OpenTelemetry Proposes Changes to Stability, Releases, and Semantic Conventions

Over the past year, the Governance Committee ran user interviews and surveys with organizations deploying OpenTelemetry at scale. A few patterns came up consistently: Stability levels aren't always obvious. When you install an OTel distribution, some components might be experimental or alpha without clear markers. This makes it harder to evaluate what's production-ready. Instrumentation libraries sometimes wait on semantic conventions.

Using AI + Rollbar's Session Replay to Understand Complex Errors

Front-end bugs are notoriously hard to reproduce. By the time an error shows up in your monitoring tool, the most important context is already gone: what the user actually did. Session replay helps, but only if someone has the time and patience to scrub through recordings, correlate events, and form a hypothesis. That’s where Rollbar’s MCP server, paired with an AI agent like GitHub Copilot, changes the game.

Fresh from AWS re:Invent: Supercharging HAProxy Community with AWS-LC Performance Packages

The timing couldn’t have been better. Last week, the tech world descended on Las Vegas for AWS re:Invent. It was the perfect venue to talk about cloud infrastructure, scale, and the future of application delivery. While we enjoyed talking shop at our booth, we didn't just bring swag and demos; we brought a significant performance improvement for our open-source community.

The Impact of Network Downtime on Enterprise Productivity - and How Monitoring Helps

Enterprise IT teams operate under relentless pressure to maintain seamless connectivity, yet many business leaders underestimate the financial impact of network downtime. Studies consistently show that even a brief outage can cost enterprises hundreds of thousands of dollars per hour, making downtime one of the most disruptive threats to business continuity.

Major Cloud Outages of 2025

Cloud outages in 2025 ranged from minor incidents affecting a subset of users to major ones affecting hundreds or thousands of users. Because many other services depend on providers like Cloudflare and AWS, outages at those providers cascaded to a much wider audience. Let's look at some of the major cloud outages of 2025.

How to use AI to analyze and visualize CAN data with Grafana Assistant

Note: A version of this post originally appeared on the CSS Electronics blog. Martin Falch, co-owner and head of sales and marketing at CSS Electronics, is an expert on CAN bus data. Martin works closely with end users, typically OEM engineers, across diverse industries, including automotive, maritime, and industrial. He is passionate about data visualization and AI—and he’s been working extensively with Grafana Assistant.

How to use Gremlin's Reliability Report

Modern applications can easily include hundreds of discrete services, all of which need to be reliable in order for the application to function correctly. While running tests on a handful of critical services can lead to small reliability improvements, real impact requires testing and increased reliability visibility across your entire organization. That’s the logic behind the new, improved Reliability Reports within Gremlin.

AI Reliability, Part 2: When the Datacenter Becomes the Bottleneck

In Part 1, we talked about all the hidden complexity inside AI systems: the pipelines, GPUs, embeddings, vector databases, orchestration layers, and everything else that quietly determines how reliable an AI-first product really is. But all of that software still rests on something far less glamorous: the physical infrastructure underneath it.

Elastic and Microsoft partnership achievements in 2025

Highlights of another successful year of customer-centric collaboration. Once again, our partnership delivered an impressive year of innovation with Microsoft Azure, Azure AI Foundry, and Azure OpenAI. This blog highlights our continued collaboration with Microsoft to better serve customers throughout 2025 and our key moments at Microsoft Ignite.