
The latest News and Information on Log Management, Log Analytics and related technologies.

Taming Log Noise With the OpenTelemetry Collector's Drain Processor

Do you receive 50 million log lines per day and struggle to see what actually matters? Health checks, heartbeat pings, connection pool messages—they all drown out the errors and anomalies you're trying to find. Most teams deal with this by writing filter rules to drop the noisy patterns. But those rules are manual, per-pattern, and brittle. A new deployment changes a log format and the filter misses it. A new service starts logging a chatty startup sequence nobody thought to exclude.
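To make the brittleness concrete, here is a minimal sketch of hand-written drop rules and how a small format change slips past them. The rule patterns, the log lines, and the `keep` helper are all hypothetical, chosen only for illustration; they are not taken from the article or from the OpenTelemetry Collector.

```python
import re

# Hypothetical hand-written drop rules, one regex per noisy pattern.
DROP_RULES = [
    re.compile(r"^GET /healthz 200$"),
    re.compile(r"^heartbeat ok node=\w+$"),
]


def keep(line: str) -> bool:
    """Return True if no drop rule matches the line."""
    return not any(rule.search(line) for rule in DROP_RULES)


# Illustrative log lines, not taken from a real system.
before_deploy = ["GET /healthz 200", "heartbeat ok node=a1", "ERROR db timeout"]
# Assume a deployment adds a latency field to the health check line.
after_deploy = ["GET /healthz 200 3ms", "heartbeat ok node=a1", "ERROR db timeout"]

kept_before = [line for line in before_deploy if keep(line)]
kept_after = [line for line in after_deploy if keep(line)]

print(kept_before)
print(kept_after)
```

Before the format change only the error line survives the filter. Afterwards the reworded health check line passes through as well, because the exact-match rule no longer applies to it.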

May the Logs Be With You: Graylog 7.1 Is Here

A long time ago, in a SOC far, far away…analysts were drowning in alerts, chasing context across fragmented screens, and watching real threats slip past detection gaps. Today, the Rebellion fights back. This isn’t a release built around a single marquee feature. It’s the result of our team listening to you on the front lines with an ear for removing the friction that makes your jobs harder than they need to be.

Federated Search | From Silos to Insight | Unified Datasets in AWS S3 with Ingest Processor

Are storage costs and data silos slowing down your investigations? In this video, we dive into the Unified Dataset Experience to show you how to search data where it lives. Learn how to use the Splunk Ingest Processor to route high-volume logs directly to AWS S3 while maintaining instant visibility via Federated Search. No more rehydrating data, just fast, cost-effective insights.

How the Coralogix CLI Adds Production Intelligence to Any Agent for Any Use Case

The new interface into production telemetry is a tool call, made from whichever agent runtime the operator happens to be using at that moment. A finance lead in Claude Code, a product manager in Cursor, an engineer in Codex. Three different jobs, three different agents, three different reasoning loops. The thing they have in common is the data layer underneath.

Real-Time Database Monitoring: Solving Database Latency with Zero-Code eBPF Tracing

In high-throughput database environments, a latency spike is rarely a simple story. Modern data layers are distributed, stateful, and constantly changing as shards move, nodes rebalance, caches warm, queries evolve, and connections churn. In practice, spikes usually come from one of three places. For many SRE and Platform teams, the real challenge is disconnected tooling. As one engineering lead recently shared during a technical workshop: “It’s all disconnected.”

Sponsored Post

Understanding the Three Pillars of Observability: Logs, Metrics and Traces

Many people wonder what the difference is between monitoring and observability. While monitoring is simply watching a system, observability means truly understanding a system's state. DevOps teams leverage observability to debug their applications or troubleshoot the root cause of system issues. Peak visibility is achieved by analyzing the three pillars of observability: logs, metrics and traces. Some use MELT (metrics, events, logs and traces) as the four pillars of essential telemetry data, but we'll stick with the three core pillars for this piece.

Your Team is Using Claude Code. Do You Know What It's Costing You?

The first two weeks of Claude Code are exciting. The third week is when you realize you don’t have visibility into what it’s doing or what it’s costing you. You would not run a production service without metrics, logs, and dashboards or deploy an API without knowing its latency, error rate, or cost per request.

Coralogix and Atlassian: Full-Stack Observability Inside the Incident Workflow

Incident response has a well-known efficiency problem. The tools teams use to detect and investigate issues are often disconnected from the tools they use to manage and resolve them. Engineers spend a significant portion of each incident switching between platforms, assembling context that should already be at hand. Even when the data is available, correlating signals across user, app, infrastructure, and security events to pinpoint a root cause remains manual and slow.