Operations | Monitoring | ITSM | DevOps | Cloud

Agentic AI: Powerful But Fragile - What You Need to Know

Just when you’d finally wrapped your head around AI, here comes its autonomous cousin, Agentic AI. Think of it as AI that doesn’t just assist, but acts. It makes decisions, handles tasks, and communicates with other systems on its own. While it’s revolutionizing supply chains and customer experiences, there’s a catch. These autonomous agents rely on a plethora of third-party services, and when one fails, everything stops.

Top five metrics to monitor in IIS Logs

When managing and troubleshooting IIS (Internet Information Services) web server performance, logs are a critical resource. They capture detailed information about every request and response so your team can detect issues quickly. Let’s walk through the main IIS log formats, explore a sample log file, and break down five key types of IIS metrics you should monitor.

How to Fix Latency Spikes in WAN and LAN Networks

Even a few seconds of delay in your network can be the difference between closing a deal on a video call and watching it buffer into oblivion. These delays, known as latency spikes, are unpredictable surges in the time it takes for data to travel across your network. Whether you're running a cloud-based CRM, managing VoIP calls across offices, or supporting remote teams on Microsoft Teams or Zoom, latency spikes can disrupt productivity, hinder performance, and lead to a flood of support tickets.

Is AI already replacing me? Insights from Civo Navigate

With all the rapid advancements in machine learning and AI, it can feel like we’re constantly playing catch-up. Over the last two Civo Navigate conferences, Berlin 2024 and San Francisco 2025, Civo brought together leading experts to discuss the future of AI, machine learning, and the growing challenges and opportunities for developers and businesses.

Identifying Idle Paths in a Data Center Leaf-Spine Fabric

In leaf-spine data center networks, traffic often becomes imbalanced, leaving some uplinks idle and resulting in wasted bandwidth. Kentik helps engineers identify underutilized paths, diagnose the causes, and take corrective action using enriched telemetry, visual topology maps, and intelligent alerts, turning hidden inefficiencies into actionable insights.

Peacetime Observability: Spotting Risks Before They Become Incidents

Most of the time, nothing’s broken. Traffic’s flowing, alerts are quiet, and everything seems fine. That’s peacetime, when no one’s getting paged. Coroot helps in both peacetime and wartime. When things go wrong, it guides you to the root cause fast. But during peacetime, it helps you spot risks early, clean up inefficiencies, and prevent those incidents from happening in the first place.

Graylog vs ELK: Which Log Management Solution Fits Your Stack?

Your app logs start simple—maybe a few print() or logging.info() calls. But in production, things get noisy. Thousands of log lines per minute, scattered across services, and it’s hard to know what matters. This is when tools like Graylog and the ELK stack help. They let you collect, search, and make sense of logs, but they do it in different ways. This guide breaks down how each one handles setup, scale, and day-to-day use.

How to Monitor and Manage Grafana Memory

It’s late, you get an alert, and Grafana is down. The reason? It ran out of memory. If you’ve ever watched Grafana slowly eat up RAM until it just stops responding, you know how frustrating that can be. Memory can spike quickly, especially with complex dashboards and multiple data sources. This guide will help you understand what’s going on and how to keep Grafana running without surprises.

Introducing our improved uptime check

Over the past few months, we’ve been working on improving our uptime check. We’re proud to announce that the improved check is now available to all users. You don’t have to do anything to get it (unless you are not subscribed to Oh Dear, in which case you should subscribe to Oh Dear); all our users now have it enabled by default. In this blog post, I’d like to give an overview of the changes and some background on why we changed some things.