Latest News

Taming Log Noise With the OpenTelemetry Collector's Drain Processor

May 4, 2026 By Mike Goldsmith In Honeycomb

Do you receive 50 million log lines per day and struggle to see what actually matters? Health checks, heartbeat pings, connection pool messages—they all drown out the errors and anomalies you're trying to find. Most teams deal with this by writing filter rules to drop the noisy patterns. But those rules are manual, per-pattern, and brittle. A new deployment changes a log format and the filter misses it. A new service starts logging a chatty startup sequence nobody thought to exclude.

NVIDIA DCGM Collector: Deep GPU Monitoring for Data Center and AI Infrastructure

May 4, 2026 By Shyam Sreevalsan In netdata

GPU infrastructure is expensive and increasingly central to production workloads. Whether you’re running ML training jobs, inference serving, video transcoding, or HPC workloads, understanding what your GPUs are actually doing, and what’s going wrong when performance degrades, is not optional.

Obkio Microsoft Teams Monitoring vs. Microsoft Teams Admin Center

May 4, 2026 By Andrii Kernitskyi In Obkio

Most IT teams rely on Microsoft Teams Admin Center as their default monitoring tool to find and fix Microsoft Teams issues, but there's a gap between what it shows and what actually causes call quality problems. Teams Admin Center gives you Microsoft's perspective on what happened after an MS Teams call ended. It doesn't tell you what was happening on your network, on your users' devices, or in the five minutes before the complaints started coming in.

What Is AWS EKS, and How Does It Work with Kubernetes?

May 4, 2026 By LogicMonitor In LogicMonitor

Amazon Elastic Kubernetes Service (Amazon EKS) is AWS’s managed Kubernetes service for deploying, scaling, and running containerized applications on AWS and on-premises. EKS automates Kubernetes control plane management, providing high availability and integration with AWS services such as IAM, VPC, and ALB.

April 2026: IsDown Users Saved 16.5 Hours with Early Outage Detection

May 3, 2026 By Nuno Tomas In isDown

In April 2026, IsDown's early detection system gave users a 3.6-hour head start on a major outage — plenty of time to implement workarounds before the vendor even acknowledged the problem. Across 45 early detections, our users saved a collective 16.5 hours by knowing about outages an average of 22 minutes before official status pages were updated.

Real-Time Database Monitoring: Solving Database Latency with Zero-Code eBPF Tracing

May 3, 2026 By Jonny Steiner In Coralogix

In high-throughput database environments, a latency spike is rarely a simple story. Modern data layers are distributed, stateful, and constantly changing as shards move, nodes rebalance, caches warm, queries evolve, and connections churn. In practice, spikes usually come from one of three places: For many SRE and Platform teams, the real challenge is disconnected tooling. As one engineering lead recently shared during a technical workshop: “It’s all disconnected.

What Is SNMP? Gain Real-Time Insights Into Network Performance (2026)

May 2, 2026 By LogicMonitor In LogicMonitor

SNMP (Simple Network Management Protocol) is the standard protocol IT teams use to monitor and manage network devices, but its real value depends on which version you run, how you secure it, and how well your monitoring tool handles the OID work for you.

Kubernetes Monitoring Tools: What Actually Works at Scale

May 2, 2026 By Faiz Shaikh In Last9

What actually works for Kubernetes monitoring at scale — not what looks good in a vendor demo with a five-pod cluster.

Stop ECS Containers From Collapsing Into One Service in OpenTelemetry

May 2, 2026 By Prathamesh Sonpatki In Last9

Why ECS containers collapse under service.name = aws_ecs and how to fix it for both the EC2 launch type and Fargate, including the resource-vs-log-record pitfall that quietly breaks log filtering. Prathamesh works as an evangelist at Last9, runs SRE stories, where SRE and DevOps folks share their stories, and maintains o11y.wiki, a glossary of all terms related to observability.

Dark Mode Has Arrived

May 2, 2026 By Matt Rideout In DNS Check

It's 2026, and DNS Check now has a dark mode. Yes, we noticed the year. Better late than dazzling our users at 2 a.m. when an MX record decides to misbehave.
