Operations | Monitoring | ITSM | DevOps | Cloud

Operational Intelligence - the new horizon of observability

Monitoring your systems isn't enough anymore. Neither is “asking questions about your system”. Operational Intelligence embraces observability to proactively deliver business insights, support decision-making, and accelerate innovation. It seems that as the observability market grows and more and more products come into the space, the meaning of the term observability itself becomes more and more nebulous.

Prometheus Gauges vs Counters: What to Use and When

Choosing the wrong metric type in Prometheus can lead to inaccurate dashboards, false positives in alerting, and missed indicators of system failure. Gauge metrics are intended for tracking values that can go up and down, such as memory usage, queue depth, or the number of active connections. Unlike counters, which only increment (or reset on restart), gauges reflect the current state of a resource at scrape time.

How Dropbox rebuilt its logging stack with Grafana Loki after a data center went dark

Two years ago, a power outage knocked a Dropbox data center offline. It wasn’t just any data center. It was the only one where Dropbox hosted Grafana Loki, meaning engineers couldn’t access their log data. “We had considered a data center outage when we were rolling out Loki, but it had just never risen up in priority enough to get put into multiple data centers,” said Chris Hodges, an infrastructure software engineer at the cloud storage company.

How to detect vulnerable GitHub Actions at scale with Zizmor

As we previously reported on April 26, 2025, we had a security incident via an insecure GitHub Action and we have since published a post-incident review. We have confirmed that there has been no code modification, unauthorized access to production systems, exposure of customer data, or access to personal information.

The Road to Loki 4.0 (Loki Community Call June 2025)

In this Loki Community Call, we welcome back Ed Welch, Principal Engineer on the Loki team. We will be discussing with Ed what is next for Loki as we push forward to Loki 4.0. If you are interested, learn more about potential architecture changes, storage formats, and an open discussion on where Ed and the Loki team would like to see the future of Loki, then make sure you join us live and have your questions answered!

Observability Across Asia-Pacific: What's Holding Teams Back? | 2025 Observability Survey Analysis

What’s holding back observability maturity in Asia-Pacific? Grafana Labs' cofounder Anthony Woods shares key takeaways from the largest global observability survey. Learn how SaaS, budget concerns, and org structure are shaping Asia-Pacific (APAC)'s future. Grafana Cloud is the easiest way to get started with Grafana dashboards, metrics, logs, and traces. Our forever-free tier includes access to 10k metrics, 50GB logs, 50GB traces and more.

Grafana Cloud updates: The latest features in Kubernetes Monitoring, Fleet Management, and more

We consistently roll out helpful updates and fun features in Grafana Cloud, our fully managed observability platform powered by the open source Grafana LGTM Stack ( Loki for logs, Grafana for visualization, Tempo for traces, and Mimir for metrics). In case you missed them, here’s our monthly round-up of the latest and greatest Grafana Cloud updates.

We did it! SquaredUp is now a B Corp

After four years. Hundreds of meetings and conversations. Countless forms and paperwork submissions… …We’ve done it. SquaredUp is finally, officially B Corp certified We can now say that we are part of a unique global movement – and we couldn’t be more proud, excited, and motivated for our journey ahead.

Grafana Cloud: Manage the AWS Observability app as code with Terraform

Imagine setting up your AWS configuration in Grafana Cloud by hand and clicking through menus. When you only have a few services, it’s not a big deal. But as you add more and more, keeping track of every little change becomes a headache. It’s easy to make mistakes, and before you know it, things can get out of sync and your monitoring becomes unreliable.