Operations | Monitoring | ITSM | DevOps | Cloud

How Dropbox rebuilt its logging stack with Grafana Loki after a data center went dark

Two years ago, a power outage knocked a Dropbox data center offline. It wasn’t just any data center. It was the only one where Dropbox hosted Grafana Loki, meaning engineers couldn’t access their log data. “We had considered a data center outage when we were rolling out Loki, but it had just never risen up in priority enough to get put into multiple data centers,” said Chris Hodges, an infrastructure software engineer at the cloud storage company.

How Cursor scaled infrastructure rapidly and reliably using Datadog

At Datadog, we use Cursor to empower our teams to build more quickly. And we know that building and troubleshooting with AI tools like Cursor is done best with the right observability data and context. Discover how Cursor was able to rapidly and reliably scale their infrastructure 100x using Datadog to meet the needs of a fast growing user base. And learn more about how we’re bring Datadog tools and context to your favorite AI IDEs and agents with our MCP Server and extensions.

How ManageEngine helped Infogain to simplify its IT operations

Anurag Chaturvedi from Infogain India shares how ManageEngine transformed the company's IT operations with improved automation and an enhanced user experience over the past 10 years. With over a decade of use, ManageEngine remains a trusted ally in Infogain’s journey towards efficient IT.

Cutting SIEM Costs in Half: How BILL Modernized Their SOC with Observo AI

When we talk to security leaders, the theme is almost always the same: “How do we keep up with explosive telemetry growth without blowing our budget—or compromising visibility?” That’s exactly what BILL, a leader in financial operations software, was grappling with.

Inside the Wins: Real Stories of Transforming Azure Observability into Business Value

Azure environments are growing fast, and so are the challenges of monitoring them at scale. In this blog, part of our Azure Monitoring series, we look at how real ITOps and CloudOps teams are moving beyond Azure Monitor to achieve hybrid visibility, faster troubleshooting, and better business outcomes. These real-life customer stories show what’s possible when observability becomes operational. Want the full picture? Explore the rest of the series.

Gett replaces paging tool with Exigence to achieve IR excellence

“By the time a pager alerts you to a problem, it’s too late to think about how to manage the incident.”(Google SRE Workbook) Gett, a global leader in urban mobility and corporate travel tech, knew that relying on its incumbent paging system and siloed manual processes for incident management was no longer sustainable. Any delay in response and service restoration could jeopardize customer satisfaction and business continuity.

How One Enterprise Reduced 1,600 Trap Alerts by 80% and Saved 26 Hours During Migration

For large-scale IT organizations, SNMP traps and log alerts are critical, but they can also be a hidden source of technical debt. Over time, alerting systems balloon with noise like redundant conditions, alerts from decommissioned tools, and logic that no longer maps to today’s hybrid infrastructure.

How a global bank turned a search engine into its data backbone

BBVA transformed customer experience and operational insight by using Elastic to unify 45B+ data points across 50+ banking services, with sub-second response times. When BBVA's David Jiménez Ausin looks back at 2014, he sees a very different banking landscape. “Almost everything was still via web channel, as the app wasn't as developed as it is now, and each service had its information in its own systems,” he recalls.

How a cooking platform whipped up a new observability plan with Grafana Cloud

As any good cook knows, if you want to create a top-notch dish, you have to use the best ingredients. So when the engineering team for Cookidoo — an online platform and app that features more than 80,000 guided recipes for the Thermomix, an all-in-one kitchen small appliance — realized the observability tool they were using to monitor the platform wasn’t delivering what they needed, they decided to switch to Grafana Cloud and OpenTelemetry.

How SpotOn overhauled its observability strategy with standardized tagging and Grafana Cloud

Many engineers would agree: migrating to a new observability platform can be a serious undertaking. But it’s also the perfect opportunity to step back, revisit some of the foundational practices that drive your observability strategy — and reap some major benefits, as a result. This was the case at SpotOn, a provider of restaurant point of sales systems and business software, which recently migrated from four disparate observability tools and consolidated on Grafana Cloud.