
Lightweight Server Monitoring - One Binary, No Stack

Monitoring a single server should not require running four daemons. Yet the default open-source recipe for “I just want to watch this one box” still looks like this: install node_exporter, stand up a Prometheus server to scrape it, add Grafana to draw the graphs, and bolt on Alertmanager so you actually hear about a full disk. That is a lot of moving parts, and a lot of YAML, for one machine. This post shows a lighter path.

You don't need a paid plan to use AI Root Cause Analysis

When an error appears in production, the hardest part often isn’t seeing what broke. It’s understanding why. That’s why we built Root Cause Analysis (RCA). It helps connect the dots between an error and its likely cause, so you can spend less time investigating and more time moving forward. Until now, RCA was only available through plans that included AI credits. Starting today, free plan users can purchase an AI credit subscription and use RCA without changing plans.

Splunk Observability at Cisco Live: Agentic Observability for the AI Era

Observability has always been about seeing clearly under pressure. But the pressure has changed. Applications are more distributed. Kubernetes environments keep expanding. Digital experiences depend on services, APIs, networks, third-party providers, and now AI models and agents that can make decisions faster than a human team can review every signal.

Observability Summit NA 2026: What the Community Is Thinking About

Two days in Minneapolis with the OpenTelemetry community, talking about where telemetry pipelines are headed and what the AI wave is doing to them. Two topics dominated everything: AI and cost reduction. Not as separate conversations, either. The more the community talked about AI telemetry, the more the cost question followed right behind it. I joined Diana Todea from VictoriaMetrics and Antonio Jimenez Martinez from Cisco ThousandEyes on the Telemetry That Matters panel.

How LivePerson optimized Logstash and Kafka performance on GCP through benchmarking

By benchmarking five GCP machine types across both Logstash and Kafka, LivePerson's observability team found that infrastructure selection (not just pipeline configuration) is one of the highest-leverage cost optimization decisions at scale.

Service Desk Automation: What It Is and How to Get Started

How much of service desk work is actual problem solving, and how much is the same repeat work every day? Most service desks follow the same pattern: password resets, access requests, software installs, approvals, and routine fixes keep coming in. Each task is simple on its own, yet together they take up most of the team’s time and push important incidents further down the queue. The main challenge is this constant flow of repeat work, which leaves little time for focused tasks.

How Support Uses Honeycomb to Debug Honeycomb

You'd think that working at an observability company means everyone knows exactly where to find everything in the data. It doesn't. Especially not on the support team. We're the ones who get the tickets. We're in the telemetry every day trying to figure out what went wrong for a customer, and we do that by pointing Honeycomb at itself. Here's how that actually works, and how it's changed.

May 2026 Early Warning Signals

In May 2026, StatusGator detected 854 Early Warning Signals across SaaS, cloud, developer, and infrastructure services. Of those signals, 695 were never acknowledged by providers, while 159 were eventually confirmed on official status pages. Throughout the month, StatusGator’s Early Warning Signals continued to surface emerging outages before many providers published updates, giving teams valuable time to investigate and respond.

Microsoft DNS management in OpUtils: One console for complete control

For network administrators, managing DNS has traditionally meant juggling zones and records across separate server interfaces, manually tracking changes, and responding to resolution failures after they’ve already caused disruption. We’re excited to introduce Microsoft DNS management in ManageEngine OpUtils, bringing DNS zone and record administration directly into the same console you already use for IP address management (IPAM).