Operations | Monitoring | ITSM | DevOps | Cloud

Baking in site reliability with observability and AI: How SpotOn uses Grafana Assistant to keep restaurants running

When you operate a restaurant, the last thing you want to do is shut your doors and turn away guests and staff because of some technology failure. And if you’re the one providing that tech, it’s your job to make sure that doesn’t happen. “For us, observability is about a lot more than just dashboards and alerts.

APM vs Observability: Both-and, not either-or

I'll start this, the third and final entry in my series on APM and Observability, which was originally inspired by my contribution to an APMdigest article, by once again pointing out that APM tools can be built with observability in mind. Many are, in fact. And the ones that aren’t don’t turn into a different type of tool. In my experience, it's more that there's a difference of mindset.

Observability in Fraud Detection: How Transaction Monitoring Tools Can Help Spot Money Laundering

In today's increasingly digital financial landscape, transaction monitoring has become a critical component of global fraud detection strategies. As financial crimes evolve in complexity, institutions must strengthen their ability to detect anomalies and uncover suspicious activity before it causes damage. Observability, a concept long used in IT and data operations is now emerging as a powerful approach for improving visibility into complex financial transactions.

How We Saved 70% of CPU and 60% of Memory in Refinery's Go Code, No Rust Required

We've just released Refinery 3.0, a performance-focused update which significantly improves Refinery's CPU and memory efficiency. Refinery has a big job: it performs dynamic, consistent tail-based sampling that maintains proportions across key fields, adjusts to changes in throughput, and reports accurate sampling rates.

Application Observability Done Right: Best Practices & Tips

Companies invest millions of dollars in observability platforms, yet they often still struggle to get application monitoring right. This is because most organizations focus on the technology, while neglecting the business. In this article, we’ll show you how to combine business requirements with technological needs. As the CTO of Logz.io, these are based on my experience working with global companies on their application observability needs.

Big Week at Logz.io: Major Product Announcements Signal New Era of AI-First Observability

Four months ago, we announced our vision of AI-first observability. Today, we’re not just talking about the future, we’re shipping it. This week marks a significant milestone with several major product announcements that demonstrate our continued momentum as the industry’s leading AI-first observability platform.

AI-powered observability: Resolve incidents faster, reduce alert fatigue, and expand access

When an incident lands in your lap, you’ll often start with a lot of questions: Why is latency so high? What’s causing this outage? How much money are we losing at this very moment? The uncertainty—and the pressure to quickly find answers—has always been one of the more nerve wracking parts of being an on-call engineer, but it doesn’t have to be that way any more.

Top 9 LLM Observability Tools in 2025

Organizations are adding GenAI to their current and future architectures and product roadmaps, requiring Ops teams to ensure LLMs are accurate, fast, secure and cost-efficient. LLM observability tools directly addresses these needs, helping identify and prevent common LLM errors and issues: LLM observability provides the telemetry data for this analysis. LLM observability tools trace requests end-to-end, evaluate outputs, and correlate quality with latency, cost, prompts, tools, and data sources.

OpenTelemetry + ignio: The Foundation for Intelligent, Unified Observability

In the previous post, What is OpenTelemetry?, we went over the What, Why, and the How of OpenTelemetry. We also went over the telemetry data lifecycle (data generation à collection à storage à usage) and how telemetry data (MELT) could be put to use to troubleshoot a representative web application scenario.

Real Estate App Development for Ops & Product Teams: From MVP to Scale

In the competitive world of real estate technology, developing an app that can scale from a Minimum Viable Product (MVP) to a fully-fledged solution is crucial. For operations and product teams, this journey involves strategic planning and execution to ensure the app meets evolving market demands and user expectations.