Logging

The latest news and information on log management, log analytics, and related technologies.

Better, Faster, Stronger Network Monitoring: Cribl and Model Driven Telemetry

New in Cribl 4.5, the Model Driven Telemetry Source enables you to collect, transform, and route Model Driven Telemetry (MDT) data. In this blog, you’ll learn how to explore the YANG Suite to understand the wide variety of datasets available to transmit as well as how to configure the tools to get data flowing from Cisco IOS XE network devices to Cribl Stream.

Crossing the machine learning pilot to product chasm through MLOps

Numerous companies keep launching AI/ML features, particularly “ChatGPT for XYZ”-style products. Given the buzz around Large Language Models (LLMs), consumers and executives alike increasingly assume that building AI/ML-based products and features is easy. LLMs can appear magical as users experiment with them.

Why You Need Observability With the Splunk Platform

Splunk’s extensible and scalable data platform has been instrumental in helping ITOps teams fully understand their tech environments and tackle any IT use case with data streaming, dashboarding, federated search, AI/ML, and more. But, with the explosion of telemetry and the growing complexity of digital systems, ITOps practitioners who rely solely on a logging solution are missing out on critical insights from their digital systems.

5 reasons why observability and security work well together

Site reliability engineers (SREs) and security analysts — despite having very different roles — share a lot of the same goals. They both employ proactive monitoring and incident response strategies to identify and address potential issues before they become service impacting. They also both prioritize organizational stability and resilience, aiming to minimize downtime and disruptions.

The UK Telecommunication Security Act (TSA): When Life Gives You Lemons, Make Lemonade

On October 1, 2022, the UK Telecommunications Security Act (TSA) went into effect, imposing new security requirements for public telecom companies. The purpose of the act is noble: to ensure the reliability and resilience of the UK telecommunications network that underpins virtually every aspect of the economy and modern society.

The Leading Stackify Alternatives

Stackify Retrace is an application performance management (APM) and log management platform designed to assist developers and DevOps teams in tracking, troubleshooting, and enhancing the performance of their applications and infrastructure. Stackify Retrace effectively combines APM with log management, enabling users to view detailed transaction traces for applications directly from the log statement to provide greater context and visibility for more effective analysis.

SRECon Recap: Product Reliability, Burn Out, and more

I recently attended SRECon in San Francisco on March 18-20, a show that gathers engineers who care deeply about site reliability, systems engineering, and working with complex distributed systems at scale. While there were a lot of talks, I’ll focus on a few areas that gave me the most insight into how having the right data impacts an SRE’s and an organization’s success.

How to Threat Hunt in Amazon Security Lake

Establishing a proactive security posture involves a data-driven approach to threat detection, investigation, and response. In the past, this was challenging because there wasn’t a centralized way to collect and analyze security data across sources, but with Amazon Security Lake it is much simpler.

Getting started with the Elastic AI Assistant for Observability and Microsoft Azure OpenAI

Recently, Elastic announced that the AI Assistant for Observability is now generally available for all Elastic users. The AI Assistant adds a new capability to Elastic Observability, providing large language model (LLM)-connected chat and contextual insights to explain errors and suggest remediation.