
The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

The New Kubernetes Monitoring Experience in Splunk Observability Cloud

In this video, I walk through the three main pieces of the new Kubernetes monitoring experience in Splunk Observability Cloud: the Kubernetes overview page for monitoring the status and top issues across your environment, the Kubernetes Entities page for troubleshooting individual instances with correlated metrics, logs, events, and configuration, and the Workload Optimization view for getting actionable recommendations on your CPU and memory resource allocation.

Demo - Selector Platform NOC Operator Workflow

See how Selector transforms NOC operations in real time. This demo walks through a typical workflow - from ingesting massive volumes of network and system data to automatically detecting anomalies, correlating events, and pinpointing true root cause. Instead of chasing alerts across siloed tools, Selector delivers a single, intelligent view - reducing noise, highlighting impact, and accelerating resolution.

Demo - Selector Platform CoPilot Diagnosis

See how Selector’s AI Copilot accelerates issue diagnosis in real time. In this demo, watch how natural language queries and AI-driven insights help teams quickly analyze incidents, surface root cause, and understand impact - without digging through multiple tools. Instead of manual investigation, Selector guides operators to answers faster, reducing noise and speeding up resolution. Built for network and operations teams who need clarity, speed, and smarter troubleshooting.

Demo - Selector Platform Dashboard Validation

See how Selector enables real-time validation and visibility through customizable dashboards. In this demo, watch how teams can quickly monitor network and system performance, validate changes, and track key metrics - all in one unified view. Instead of piecing together data across tools, Selector delivers clear, actionable insights that help teams stay aligned and make faster decisions. Built for network and operations teams who need instant visibility and confidence in their environment.

Demo - Selector Platform Actionable Correlation

See how Selector turns fragmented alerts into actionable insight through intelligent correlation. In this demo, watch how events from across the environment are automatically connected, reducing noise and revealing the true root cause behind incidents. Instead of chasing isolated alerts, teams get a single, clear view of what’s happening and what to do next - faster. Built for network and operations teams who need to cut through noise and resolve issues with confidence.

Apache ActiveMQ High Availability Architecture: The Complete 2026 Guide

The most common Apache ActiveMQ high availability mistake is not a configuration error; it is a false assumption. Teams deploy two broker instances, point clients at both with a comma-separated URL, and label the topology "HA." Then the primary crashes, the secondary does not have the message state, and clients start throwing exceptions while the ops team scrambles.

How to Exclude Health Check Endpoints from Python OTel Traces

Health check endpoints generate thousands of identical, useless spans per day. Here are two production-ready approaches to filter them from your Python OTel traces, and the correctness trap most implementations miss. Prathamesh works as an evangelist at Last9, runs SRE Stories, where SRE and DevOps folks share their stories, and maintains o11y.wiki, a glossary of terms related to observability.
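As a rough illustration of the filtering idea, the sketch below matches request paths against a comma-separated list of regular expressions, the same general shape as the `OTEL_PYTHON_EXCLUDED_URLS` setting used by OpenTelemetry's Python instrumentations. It uses only the standard library and does not call OpenTelemetry. The helper names `parse_excluded_urls` and `is_excluded` are hypothetical, and matching against the path alone is an assumption (a real instrumentation may match the full URL). It also shows one way an unanchored pattern can silently drop real traffic, which may or may not be the trap the article has in mind.

```python
import re


def parse_excluded_urls(raw: str) -> list:
    """Split a comma-separated string of regexes into compiled patterns."""
    return [re.compile(part.strip()) for part in raw.split(",") if part.strip()]


def is_excluded(path: str, patterns: list) -> bool:
    """Return True if any pattern matches anywhere in the path (re.search semantics)."""
    return any(pattern.search(path) for pattern in patterns)


# Unanchored patterns: short to write, but they match substrings.
loose = parse_excluded_urls("health,ready")

# Anchored patterns: match only the exact health and readiness paths.
strict = parse_excluded_urls(r"^/healthz?$,^/ready$")

# Both lists exclude the real health check endpoint.
health_loose = is_excluded("/health", loose)
health_strict = is_excluded("/healthz", strict)

# Only the loose list also excludes a business endpoint that merely
# contains the substring "health".
business_loose = is_excluded("/api/healthcare/patients", loose)
business_strict = is_excluded("/api/healthcare/patients", strict)
```

With these inputs, the loose list drops `/api/healthcare/patients` along with the health check, while the anchored list keeps it. Anchoring exclusion patterns is one reasonable default when the excluded routes are known exactly.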

last9-genai: Closing the Conversation Gap in LLM Observability

OpenTelemetry's GenAI instrumentation gives you spans and token counts. It does not give you conversations, workflow cost rollups, or prompts visible in your dashboard. last9-genai is an OTel extension that fills those three gaps without replacing your existing observability stack. Prathamesh works as an evangelist at Last9, runs SRE Stories, where SRE and DevOps folks share their stories, and maintains o11y.wiki, a glossary of terms related to observability.

State of Observability in Financial Services 2026: From implementation to business impact

The demands on financial services companies are intensifying rapidly. They must not only deliver seamless system performance but also control costs, secure sensitive data, and maximize the value of their observability investments. To navigate these converging pressures, leaders are evolving their approach to system monitoring and telemetry. The 2026 State of Observability in Financial Services research report reveals a fundamental shift in how organizations manage their digital infrastructure.

Digitate is Positioned as a Leader in the IDC MarketScape: Worldwide AIOps 2026 Vendor Assessment

IT operations are in a new era – teams are expected to deliver always-on reliability, absorb constant change, manage runaway telemetry volumes, and still prove business impact. The IDC MarketScape: Worldwide AIOps 2026 Vendor Assessment (doc, March 2026) offers ITOps leaders a valuable lens on the AIOps landscape and the providers shaping what comes next.