Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on Log Management, Log Analytics and related technologies.

Elastic Cloud Serverless now available in technical preview on Microsoft Azure

Elastic Cloud Serverless provides the fastest way to start and scale security, observability, and search solutions — without managing infrastructure. Today, we are excited to announce the technical preview of Elastic Cloud Serverless on Microsoft Azure — now available in the EastUS region. Elastic Cloud Serverless provides the fastest way to start and scale security, observability, and search solutions without managing infrastructure.

Latest Product Updates and Features in Logz.io | February 2025

We’re excited to announce a series of upgrades to our AI Agent, Log Management Explore UI and core integrations designed to empower you with even deeper observability and streamlined operations. These updates enhance account visibility, multi-telemetry trace insights, and logging capabilities while ensuring seamless compatibility with OpenTelemetry. Read on to discover how these enhancements can help you gain more clarity and control over your environment.

Full Guide to Linux Disk IO Monitoring, Alerting and Tuning

Disk IO (Input/Output) is a core aspect of system performance. Whether you’re managing a database, a web application, or a cloud server, how efficiently your system reads and writes data affects everything from response times to stability. Unlike high CPU usage or memory bottlenecks that often manifest immediately, disk IO issues tend to creep up silently—until they slow down critical processes.

How to Stop Memory Leaks Before they Crash Your Linux System

Imagine you’ve got a leaky faucet in your kitchen. At first, it’s just a drip here and there—annoying, sure, but not enough to ruin your day. But leave it unchecked, and soon that drip turns into a steady trickle. Your water bill skyrockets, the sink overflows, and before you know it, you’re ankle-deep in chaos. Now, replace that faucet with a Linux system, and you’ve got a memory leak.

5 Ways to Prevent CPU Overload on Linux Servers

Every server administrator’s nightmare starts with a message: “CPU usage at 100%” It’s that critical moment when your Linux server transforms from a reliable workhorse into a sluggish mess, taking your applications and user experience down. We’ve all been there… staring at a terminal, watching load averages climb, while frantically trying to figure out which process decided to throw a CPU-hungry party on our server.

Telemetry Pipeline 101

Are you looking to enhance your observability and gain deeper insights into your systems? Curious about how a Telemetry Pipeline can revolutionize your monitoring and troubleshooting capabilities while keeping the cost low? Join Mezmo’s Bill Balnave (Vice President of Technical Services) for an insightful webinar unraveling Telemetry Pipeline’s key concepts, highlighting its significance in modern software development and operations. Discover how a Telemetry Pipeline enables you to collect, profile, transform, and analyze crucial telemetry data from your applications and infrastructure.

Kubernetes Monitoring and Alerting Made Easy with Splunk Observability Cloud and OpenTelemetry

In this video, I'll show you how to quickly setup monitoring and alerting for your Kubernetes clusters using Splunk Observability Cloud. We’ll start by deploying the Splunk OpenTelemetry Collector using Helm, and then use the Kubernetes Navigator inside Splunk Observability Cloud to view the health of our cluster and the applications it’s hosting. I’ll demonstrate AutoDetect detectors and alerts by intentionally triggering an issue in the cluster and walk through the alerting process. We’ll review the alerts in Splunk Observability Cloud and then resolve the issue in the cluster.

Petabyte Scale, Gigabyte Costs: Mezmo's Evolution from ElasticSearch to Quickwit

At Mezmo, we handle an enormous volume of telemetry data for our customers and ourselves, requiring a robust and efficient search and analytics backend. For years, ElasticSearch served us well, but as our infrastructure grew to a multi-cluster, multi-petabyte scale, we started to see the cracks—rising costs, performance bottlenecks, and scalability concerns. We needed a change, one that would make our system more cost-effective while maintaining speed and reliability.

SSHD Logs 101: Configuration, Security, and Troubleshooting Scenarios

Secure Shell (SSH) is a fundamental tool for remote system administration, and its logs play a critical role in security monitoring, debugging, and compliance. SSHD logs provide insights into authentication attempts, connection successes, failures, and potential intrusions. This guide explores everything you need to know about SSHD logs, including their location, format, analysis, and lesser-known security practices to maximize their effectiveness.