Introducing our improved uptime check

Over the past few months, we’ve been working on improving our uptime check. We’re proud to announce that the improved check is now available to all users. You don’t have to do anything to get it (unless you aren’t subscribed to Oh Dear, in which case you should subscribe): it is now enabled by default for all our users. In this blog post, I’d like to give an overview of the changes and some background on why we changed some things.

How to Monitor and Manage Grafana Memory

It’s late, you get an alert, and Grafana is down. The reason? It ran out of memory. If you’ve ever watched Grafana slowly eat up RAM until it just stops responding, you know how frustrating that can be. Memory can spike quickly, especially with complex dashboards and multiple data sources. This guide will help you understand what’s going on and how to keep Grafana running without surprises.

Graylog vs ELK: Which Log Management Solution Fits Your Stack?

Your app logs start simple—maybe a few print() or logging.info() calls. But in production, things get noisy. Thousands of log lines per minute, scattered across services, and it’s hard to know what matters. This is when tools like Graylog and the ELK stack help. They let you collect, search, and make sense of logs, but they do it in different ways. This guide breaks down how each one handles setup, scale, and day-to-day use.

Peacetime Observability: Spotting Risks Before They Become Incidents

Most of the time, nothing’s broken. Traffic’s flowing, alerts are quiet, and everything seems fine. That’s peacetime, when no one’s getting paged. Coroot helps in both peacetime and wartime. When things go wrong, it guides you to the root cause fast. But during peacetime, it helps you spot risks early, clean up inefficiencies, and prevent those incidents from happening in the first place.

Identifying Idle Paths in a Data Center Leaf-Spine Fabric

In leaf-spine data center networks, traffic often becomes imbalanced, leaving some uplinks idle and resulting in wasted bandwidth. Kentik helps engineers identify underutilized paths, diagnose the causes, and take corrective action using enriched telemetry, visual topology maps, and intelligent alerts, turning hidden inefficiencies into actionable insights.

Is AI already replacing me? Insights from Civo Navigate

With all the rapid advancements in machine learning and AI, it can feel like we’re constantly playing catch-up. Over the last two Civo Navigate conferences, Berlin 2024 and San Francisco 2025, Civo brought together leading experts to discuss the future of AI, machine learning, and the growing challenges and opportunities for developers and businesses.

How to Fix Latency Spikes in WAN and LAN Networks

Even a few seconds of delay in your network can be the difference between closing a deal on a video call and watching it buffer into oblivion. These delays, known as latency spikes, are unpredictable surges in the time it takes for data to travel across your network. Whether you're running a cloud-based CRM, managing VoIP calls across offices, or supporting remote teams on Microsoft Teams or Zoom, latency spikes can disrupt productivity, hinder performance, and lead to a flood of support tickets.