Sponsored Post

How to Discover Advanced Persistent Threats in AWS

When it comes to managing AWS cloud security, a growing concern for security operations (SecOps) teams is the increasing sophistication of digital threats. While conventional cyber threats deploy widely known tools and techniques in crude, all-or-nothing attempts to breach enterprise security controls, sophisticated attacks known as Advanced Persistent Threats (APTs) employ more advanced technologies and methods to gain and maintain access to secure systems for long periods of time.

Create a Splunk pipeline to filter, mask, and route logs - without SPL2

In this video, we will take a look at how you can create a Splunk Data Management pipeline to filter, mask, and route your logs without using any SPL2 code. For this demo we have used Ingest Processor to build our pipeline, but the same concept can be used for Edge Processor as well.

Finding Your Way: Using Metrics to Explore Organizational Architecture

Imagine being the new developer in a bustling tech company. Everyone is rushing to meet deadlines, and no one has time to explain the tangled web of services, databases, and messaging systems that make up the organization’s architecture. You search high and low for documentation, but the few diagrams you find are outdated or incomplete. Feeling lost? This is where metrics can come to the rescue.

Realizing the business value of OpenTelemetry-native observability

Transform your organization's observability strategy with open standards and simplified data collection. Modern organizations face an unprecedented observability challenge. As systems grow more complex and distributed, traditional monitoring approaches are struggling to keep pace. With data volumes doubling every two years and systems spanning multiple clouds and technologies, organizations need a new approach to maintain visibility into their operations.

Integrating Google SecOps with Bindplane January 2025

Google SecOps (formerly Chronicle) is Google Cloud's security operations platform (SIEM) that helps you detect, investigate, and respond to cybersecurity threats. Integrating Bindplane gives you an easy way to standardize how you collect, process, and forward security-relevant data to Google SecOps. In this webinar you’ll get a hands-on demo of how to configure log collection with the BindPlane Agent, and best practices for data standardization using open standards and OpenTelemetry. This will let you focus on the important task of investigating threats with Google SecOps instead of configuring telemetry pipelines.

Getting the Most Out of Java With SolarWinds Loggly

Logs are a developer’s first line of defense when monitoring and troubleshooting distributed applications. They provide insights into performance, user behavior, and application stability, whether your application is written in Java or another language. However, when your applications scale up, and you have fragmented log data scattered across different systems, this will complicate your troubleshooting effort. This is what makes a centralized logging tool like SolarWinds Loggly essential.

Global data mesh for public sector organizations

The sheer volume of data, often siloed and lacking interoperability, can make it challenging to get a big-picture, accurate view across complex public sector environments. With a global data mesh, you gain fast access to all potentially relevant information, regardless of source, format, or location.

Micro Lesson: Introduction to Sumo Logic Mo Copilot

The video introduces Sumo Logic's Mo Copilot, an AI-powered assistant that simplifies complex query creation using natural language, making it accessible for users of all skill levels. Mo Copilot enhances productivity by providing AI-driven insights and recommendations, allowing teams to detect and resolve incidents more efficiently. It consolidates logs into a unified view, improving collaboration and decision-making. Overall, Mo Copilot transforms the way security and development teams work with data.

The power of cloud native observability

Unstructured data clouding your observability goals? Learn why monitoring alone cannot solve business-critical performance issues as Sr. Director of Technical Marketing Adam White explains how combining structured and unstructured data with real-time analytics unlocks dynamic insights into root cause analysis and performance management in the cloud.

The problem with traditional log management

Logs are everywhere and contain valuable information that can make or break everything from security investigations to avoiding an outage, but legacy log management systems are inefficient for modern organizations generating more data than ever before. Sr. Director of Technical Marketing Adam White offers guidance on the pitfalls of traditional log management and what your organization can do today to jumpstart your digital transformation journey!

Reimagining Log Management Tools and Software: The Impact of AI and GenAI

Today’s distributed, cloud-native systems generate logs at a high rate, making it increasingly difficult to derive actionable insights. AI and Generative AI (GenAI) technologies, particularly large language models (LLMs), are transforming log management tools by enabling teams to sift through this data, identify anomalies, and deliver real-time, context-rich intelligence to streamline troubleshooting.

Why Data Tiering is Critical for Modern Security and Observability Teams

In today's digital landscape, security and observability teams face an unprecedented challenge: managing massive volumes of data while maintaining both performance and cost-effectiveness. As organizations generate more data than ever before, the traditional approach of storing everything in high-performance, expensive systems is becoming unsustainable. How will your team evolve how it manages and uses telemetry data across the enterprise?

What Are Syslog Levels and Why Should You Care?

Syslog is a foundational part of logging in Linux and Unix-based systems, helping engineers efficiently capture and analyze system events. Among its core components, syslog levels play a crucial role in categorizing logs based on their severity. Understanding these levels can significantly improve troubleshooting, monitoring, and alerting strategies.
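As a rough illustration of how severity levels categorize messages, here is a minimal Python sketch. It assumes the eight severity values and the priority encoding (facility × 8 + severity) defined in RFC 5424; the helper names `decode_pri` and `should_alert` are hypothetical and not taken from the linked article.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Syslog severity levels as numbered in RFC 5424 (lower is more severe)."""
    EMERGENCY = 0
    ALERT = 1
    CRITICAL = 2
    ERROR = 3
    WARNING = 4
    NOTICE = 5
    INFORMATIONAL = 6
    DEBUG = 7


def decode_pri(pri: int) -> tuple[int, Severity]:
    """Split a syslog PRI value into (facility number, severity)."""
    if not 0 <= pri <= 191:
        raise ValueError(f"PRI out of range: {pri}")
    facility, severity = divmod(pri, 8)
    return facility, Severity(severity)


def should_alert(pri: int, threshold: Severity = Severity.ERROR) -> bool:
    """Hypothetical alerting rule: fire when the message is at least as severe as the threshold."""
    _, severity = decode_pri(pri)
    return severity <= threshold


if __name__ == "__main__":
    # A message that starts with "<34>" carries PRI 34.
    print(decode_pri(34))
    print(should_alert(34), should_alert(165))
```

Because lower numbers mean higher severity, a threshold comparison such as `severity <= threshold` is one simple way to separate messages that should page someone from those that only need to be stored.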

Error Logs: What They Are, Why They Matter, and How to Use Them

Whether managing a web application, monitoring an API, or tracking system performance, error logs are your first defense in troubleshooting and improving your systems. However, understanding them beyond the basics can make all the difference in diagnosing complex issues and enhancing the overall user experience. In this in-depth guide, we’ll explore everything you need to know about error logs, including how to read them, why they matter, and some tricks to make them work for you.

The Power of Structured Logging: Why It Matters in Modern Development

Structured logging has emerged as a crucial aspect of modern application development and monitoring. Unlike traditional logging, structured logging organizes log data into a defined format, often in JSON or XML, making it easier to parse, search, and analyse. This practice simplifies troubleshooting, enhances observability, and supports seamless integration with monitoring tools.

The Importance of Data Normalization for Log Files

Imagine sitting in an airport’s international terminal. All around you, people are talking to friends and family, many using different languages. The din of noise becomes a constant thrum, and you can’t make sense of anything – not even conversations in your native language. Log data is similar to this scenario. Every technology in your environment generates log data: information about the activities taking place, from logins to processing.

Generative AI QE: Insights from testing Sumo Logic Mo Copilot

Generative AI is transforming industries by automating tasks and delivering AI tools, such as the AI assistant Sumo Logic Mo Copilot, to enhance operational efficiency. But these advancements also challenge traditional quality engineering (QE) methodologies. Unlike conventional software testing, AI models produce dynamic, context-sensitive outputs, requiring a new approach to validation and testing. At Sumo Logic, we faced similar challenges while testing Mo Copilot.

Comparing Azure NSG and VNet Flow Logs

Phil Gervasi compares Azure NSG Flow Logs and VNet Flow Logs, explaining the benefits VNet Flow Logs bring to network observability in Azure environments. Learn how VNet Flow Logs simplify network monitoring, improve traffic visibility, and address the limitations of NSG Flow Logs by capturing traffic at the virtual network level. Learn about VNet Flow Log applications—including traffic analysis, network optimization, and security enhancement—and how Kentik integrates with these logs for deeper insights and advanced analytics.

Getting it right with GenAI in financial services: Where to focus in 2025

I attended ElasticON recently where we spent the day with our NYC Elastic community, talking about the combined value of vector databases using retrieval augmented generation (RAG) to feed large language models (LLMs) for next-level generative AI (GenAI) results. Elastic’s CTO and Founder Shay Banon kicked off his keynote with an important message: GenAI is not magic.

A Guide to Logging and Debugging in Java

During the development of your program, you might rely on simple println() statements to trace program execution flows and identify issues in your code. But as projects grow in size and complexity, print statements quickly become messy. A better approach to tracing program execution is logging, an approach that provides a consistent and organized way to track your application’s behavior, allowing you to systematically identify and resolve issues.

AI in Observability: Mapping Root Causes with Precision

Explore how AI is transforming observability by mapping system connections and uncovering root causes with precision. The Logz.io AI Agent analyzes logs, metrics, and service dependencies to provide actionable insights without the need to sift through overwhelming amounts of data.

AI-Powered Log Management: Faster Troubleshooting with Logz.io

Managing logs in a fast-paced cloud-native world can be tough. Log data is growing, and traditional tools just can’t keep up. That’s where Logz.io comes in—a log management and analytics platform powered by AI to make troubleshooting, performance monitoring, and collaboration faster and easier than ever.

How to run Loki at scale on Kubernetes (Loki Community Call January 2025)

Happy New Year from the Loki Engineering team. To kick off 2025, Nicole and Jay will be joined by Poyzan Taneli from the Loki Engineering team to discuss how to run Loki at scale on Kubernetes. If you are currently running Loki in microservices mode or preparing to do so, we will be discussing best practices for scaling its components to meet the demands of production use cases.

Optimizing long-term data retention with Elastic Cloud Hosted: Ensuring compliance and efficiency for government

In the digital era, state and local governments are increasingly tasked with managing vast volumes of data while ensuring compliance with stringent regulatory requirements. These regulations, which can vary significantly depending on jurisdiction, often require the retention of data for extended periods — sometimes ranging from one to seven years.

KubeCon 2024 | Interviews with Observability Experts | Observability Insights with Josh Lee

Join me at KubeCon 2024 as I sit down with Josh Lee, Developer Advocate at Altinity, to discuss the latest trends, challenges, and insights in observability. In this interview, we cover key topics such as OpenTelemetry adoption (including the Open Agent Management Protocol), data sovereignty, standardization through semantic conventions, and the need to unify observability tooling across organizations.

Cribl Surpasses $200M ARR!

I’m so excited to share that Cribl recently surpassed $200 million in annual recurring revenue! This milestone and our rapid growth come down to one thing: solving real problems for our customers. The more our customers partner with us and use Cribl products to simplify their telemetry data management, the more our business grows and the more milestones we’ll hit together. Thank you to our fantastic customers and partners who have helped us reach this point in our journey!

Fast-Track Kubernetes Observability with Logz.io and OpenTelemetry: A quick getting started guide

In formal terms, OpenTelemetry is an open source framework used for instrumenting, generating, collecting, and exporting telemetry data for applications, services, and infrastructure. It provides vendor-neutral tools, SDKs and APIs for generating, collecting, and exporting telemetry data such as traces, metrics, and logs to any observability backend, including both open source and commercial tools.

SLF4J vs Log4j: Key Differences and Choosing the Right One

When building robust, maintainable, and scalable Java applications, logging plays an essential role in debugging, monitoring, and ensuring smooth performance. Two of the most widely used logging frameworks in the Java ecosystem are SLF4J and Log4j. While both serve similar purposes, they offer different approaches and features, making it important to understand their differences before making a choice.

Serilog: Configuration, Error Handling & Best Practices

When building modern .NET applications, logging is one of those things you don’t want to get wrong. Serilog steps in as a popular logging framework that has earned its spot as a go-to tool for developers. Why? Because it’s flexible, versatile, and does an awesome job of giving you clear insights into your app's behavior. But what exactly is Serilog?

Log Levels: Different Types and How to Use Them

When you're working with logs in software development, one key thing to understand is log levels. They help us organize log messages, making it easier to find and analyze the most important ones. In this guide, we'll walk through what log levels are, why they matter, and how to use them effectively. Let’s get started!

Must-Have Features for Your Log Management Software

With so many choices available to us today, choosing log management software that’s just right for us has never been simpler. That is, if you know exactly what it is you are looking for. But for many users, the sheer number of programs that perform the same tasks, and seem so similar (sometimes almost identical) to each other, can quickly become off-putting and confusing.

Open source log management tools in 2025

Log management tools provide visibility into the performance and behavior of systems, applications, networks, and infrastructure components. By collecting and analyzing logs, you can monitor for anomalies, track trends, and identify potential issues before they escalate. Choosing the right log management solution requires careful consideration of several factors to ensure that it meets your specific needs and goals. Here are the most popular open source log management tools to help you choose.

Golang Monitoring using OpenTelemetry

When it comes to monitoring Golang applications, there are various tools and practices you can use to gain insights into your application's performance, resource usage, and potential issues. By using OpenTelemetry for monitoring in your Go applications, you can gain valuable insights into the behavior, performance, and resource utilization of your distributed systems, allowing you to troubleshoot issues, optimize performance, and improve the overall reliability of your software.

How Telemetry Pipelines Save Your Budget

This is an updated version of an earlier blog post to reflect current definitions of a telemetry pipeline and additional capabilities available in Mezmo. Our recent blog post about observability pipelines highlighted how they centralize telemetry data and make it actionable. A key benefit of telemetry pipelines is that users don't have to compare data sets manually or rely on batch processing to derive insights; this can be done directly while the data is in motion.

How to Use Static Thresholds for Effective Alerts in Splunk Observability Cloud

In this video, we explore the concept of static thresholds, which are a foundational tool in your observability alerting solution. We will also demonstrate static thresholds in Splunk Observability Cloud by configuring a static threshold for AWS EC2 memory utilization, and look at additional threshold settings like trigger sensitivity and duration. By the end of this video, you'll have the knowledge to effectively incorporate static thresholds into your observability strategy.

Introducing Logsene CLI

In vino veritas, right? During a recent team gathering in Kraków, Poland, and after several yummy bottles of țuică, vișinată, żubrówka, diluted with some beer, the truth came out – even though we run Logsene, a log management service that you can think of as hosted ELK Stack, some of us still ssh into machines and grep logs! Whaaaaat!? What happened to eating our own dog food!?

Take control of your OpenTelemetry Collectors with Otel Remote Management

Managing OpenTelemetry (OTel) collectors across diverse, cloud-native environments is key to streamlining monitoring and gathering valuable insights. But, managing them effectively, especially across multiple servers, has been a manual and time-consuming process. That changes today. Sumo Logic’s Otel Remote Management is designed to simplify OpenTelemetry Collector management, all from a single unified user interface.

Master Telemetry Replay with Cribl Stream and Cribl Lake

What do you do when an incident occurs, and you need to investigate and troubleshoot? Replay data. What about performing audit trails for compliance and reporting? Replay data. Need to do system testing and validation? Replay data. There are countless reasons to replay telemetry, but the ease of doing so largely depends on the tools and infrastructure you have in place. Manual replay is often cumbersome and time-consuming, requiring access to stored raw data in logs or files.

Logz.io Earns Special Mention for Best Use of AI from the 2024 O11ys Awards

We’re thrilled to announce that Logz.io received a Special Mention for Best Use of AI from the 2024 O11ys Awards, a celebration of innovation and excellence in observability. The award recognized our AI Agent. This recognition validates our mission to simplify observability with AI, empowering teams to troubleshoot faster, optimize costs, and focus on innovation.

Trusting Cribl: Strengthening Your Software Supply Chain with Transparency and Security

Let’s face it—the term "software supply chain" can feel like navigating a maze of tech jargon. Commit signing, Software Composition Analysis (SCA), eBPF monitoring, SBOM generation, provenance attestations… the list goes on. But at its core, the software supply chain is the backbone of modern development, and its security is non-negotiable. A single vulnerability in this chain can ripple through entire systems, leading to breaches, downtime, and reputational damage.

Top 13 Splunk Alternatives in 2025: From Open Source to Enterprise Solutions

Splunk is a powerful tool for data analysis and monitoring, but its high costs and complex implementation can be challenging for many organizations. Here are 13 proven Splunk alternatives that provide robust monitoring capabilities, comprehensive data analysis, and more cost-effective solutions for organizations of all sizes.

Why Observability Needs AI: Revolutionizing Monitoring for Modern Complex Systems

In this insightful talk, Asaf Yigal, Co-founder and VP of Product at Logz.io, shares the turning point in observability: addressing the growing complexity of modern environments with AI-driven solutions. From Kubernetes to multi-cloud infrastructures, traditional observability tools fall short in solving complex problems. Discover how Logz.io leverages artificial intelligence to simplify monitoring, enhance troubleshooting, and revolutionize how companies tackle observability challenges. Learn why smarter, AI-powered tools are the future of observability.

Introducing GenAI for Observability: Root Cause Analysis Made Easy

Discover how Logz.io is transforming observability with GenAI, enabling you to troubleshoot complex problems and optimize cloud configurations effortlessly. In this video, we showcase how GenAI leverages your data to perform advanced root cause analysis, automating the process of identifying and resolving exceptions in modern, complex environments. Learn how GenAI analyzes deployment changes, workload patterns, and configuration updates to provide a detailed report in under a minute. Say goodbye to manual troubleshooting and hello to smarter, AI-powered insights.

The Future of Observability: Embracing Change with AI-Driven Insights

Discover how AI is revolutionizing observability and transforming the way we work. In this insightful talk, we explore the parallels between the adoption of Google search and the shift toward natural language-driven observability. Learn why outdated methods like manual graphs, alerts, and extensive data storage are becoming obsolete. It’s time to embrace change, ask questions naturally, and get the answers you need—effortlessly.

Revolutionizing Root Cause Analysis with Generative AI: The RAG Approach and Multi-Agent Models

Explore how cutting-edge Generative AI techniques are transforming root cause analysis and troubleshooting. This video dives into the innovative use of the RAG (Retrieval-Augmented Generation) approach to combine past data with real-time information and multi-agent models for dynamic problem-solving. Learn how AI agents ask follow-up questions, analyze data, and deliver highly accurate results like never before.

Structured Logging Best Practices: Implementation Guide with Examples

In structured logging, log messages are broken down into key-value pairs, making it easier to search, filter, and analyze logs. This is in contrast to traditional logging, which usually consists of unstructured text that is difficult to parse and analyze.
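To make the contrast concrete, here is a minimal Python sketch that emits the same event as unstructured text and as key-value pairs serialized to JSON. It uses only the standard library; the formatter class, the `format_event` helper, and the field names (`order_id`, `user`) are illustrative assumptions, not something prescribed by the linked guide.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object of key-value pairs."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Merge any extra key-value pairs attached to the record (hypothetical "fields" attribute).
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, sort_keys=True)


def format_event(message: str, **fields) -> str:
    """Build a record by hand and return its structured (JSON) form."""
    record = logging.LogRecord(
        name="checkout", level=logging.INFO, pathname="example.py",
        lineno=0, msg=message, args=(), exc_info=None,
    )
    record.fields = fields
    return JsonFormatter().format(record)


if __name__ == "__main__":
    # Unstructured: the values are buried in free text and need a regex to extract.
    print("INFO order 42 placed by alice")
    # Structured: the same values are addressable by key.
    print(format_event("order placed", order_id=42, user="alice"))
```

With the structured form, a log platform can filter on `order_id` or `user` directly instead of pattern-matching against message text.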

Using SolarWinds Loggly to Get the Most Out of MongoDB Structured Logging

Logs are essential for understanding and optimizing performance, and MongoDB structured logging makes them more powerful. By organizing logs into a consistent format, we can query and analyze them more efficiently. However, dealing with logs locally has its limits. That’s where a centralized log management tool like SolarWinds Loggly comes in. Shipping MongoDB logs to SolarWinds Loggly gives you a unified view of your data, advanced analytics, and proactive monitoring.

Observability Insights From KubeCon 2024 - Summary

In this video, I’m breaking down the biggest themes and key takeaways from KubeCon 2024’s observability sessions, from OpenTelemetry’s growing role as the standard for telemetry data to how AI and continuous profiling are shaping the future of proactive, scalable, and cost-effective observability. If you missed KubeCon 2024 or want to stay on top of observability trends, this recap will get you up to speed in just a few minutes.

APAC in 2025: A Harder Look at AI, Data and Cybersecurity Standards

This year has been transformative for technology, reshaping the business landscape with groundbreaking advancements and unprecedented challenges. Generative AI continues to unlock new possibilities, while cybersecurity threats have escalated to new heights. Across APAC — a fast-emerging global innovation hub — businesses have grappled with the twin forces of regulatory evolution and technological breakthroughs.

Learn SPL Command Types: Efficient Search Execution Order and How to Investigate Them

When performing searches, Splunk uses its own language, SPL (Search Processing Language). SPL commands can be categorized into several types depending on the processing they perform. Especially in a distributed environment where the Splunk system is made up of multiple servers, if you do not understand which components perform heavy processing depending on the SPL type, you may create inefficient searches.

Navigating 2025: Turning Uncertainty into Opportunity

The end of the year for technology companies always brings with it a raft of new predictions for the coming twelve months. Many predictions, breathlessly delivered, suggest a tenuous future can be conveniently avoided with the appropriate application of vendors’ products. Using predictions as a way to shill products is boring, and it misses an opportunity to help enterprises plan for the coming year. After all, predictions don’t have to be correct to be useful.

Business Intelligence (BI): What It Means for Your Organization

Data drives the modern business world, and organizations capable of leveraging it effectively maintain a significant edge over their competition. Business Intelligence (BI) has emerged as a critical tool, enabling companies to turn raw data into actionable insights. But what exactly is BI? This blog explores everything you need to know about business intelligence, from its components to use cases, to implementation strategies.

Latest Product Updates and Features in Logz.io | January 2025

We’re thrilled to launch our brand-new and improved Support Help Center, designed to streamline how you interact with our support team and access the resources you need. This is more than just a support portal: it’s a centralized hub to enhance your experience, provide solutions faster, and keep your feedback front and center in our development process. Explore our new Support Help Center for answers and assistance!

Centralized Log Management for the Digital Operational Resilience Act (DORA)

The financial services industry has been a target for threat actors since before digital transformation was even a term. Further, financial services organizations find themselves continuously under scrutiny. As members of a highly regulated industry, these companies need to comply with various laws to ensure that they effectively protect sensitive data.

Starting 2025 on a High Note: Coralogix Bags 126 G2 Winter Badges

As the holiday season comes to an end and we step into 2025 with renewed energy and excitement, Coralogix kicks off the year with a remarkable gift of achievements! In the G2 Winter 2025 Reports, we are thrilled to announce that we’ve been recognized with a phenomenal 126 badges across multiple categories and market segments. This remarkable feat is a testament to the trust and love of our customers and the dedication of our team.

Enterprise guide to streamlined log collection using Site24x7

Handling logs in a large-scale server infrastructure is no small task. It’s a critical component of maintaining smooth operations, especially for industries like healthcare, where over 1,000 servers might be managing everything from patient records to billing systems. When these logs are scattered and disconnected, this disarray slows troubleshooting, fragments operational insights, and ultimately undermines system reliability.

Coralogix at AWS re:Invent 2024 Highlights

We had a blast at AWS re:Invent 2024 and our team was invigorated by the incredible response and feedback we received from the thousands of participants who visited our booth. It was clear that a recurring theme among companies is the need for an observability solution that not only scales affordably with increasing data volumes but is also at the forefront of innovation. Coralogix stands out as the ideal match for these requirements.

Migrating from DIY ELK to a Full SaaS platform

Managing modern systems requires a constant balance between operational efficiency and innovation. Beyond that, maintaining seamless operations and delivering exceptional customer experiences increasingly depend on robust observability. For years, the ELK stack (Elasticsearch, Logstash, Kibana) has been the go-to solution for many organizations for log management and observability, offering flexibility, control, and an open source approach.

How to send OTLP or Prometheus metrics and logs to Grafana Cloud with Grafana Alloy

We introduced Grafana Alloy last year in an effort to create the best possible open source “big tent” telemetry collector. A continuation of our work on Grafana Agent Flow, we designed Alloy to simplify observability at scale and to easily integrate with the OpenTelemetry and Prometheus ecosystems. We’ve seen lots of interest since Alloy was announced at GrafanaCON 2024, and industry observers are taking notice, too.

The Best Real-Time Data Streaming Tools

For organizations, it is crucial to swiftly respond to evolving market dynamics, shifting customer preferences, and emerging operational challenges. This responsiveness is made possible through the use of real-time data streaming technologies, which provide a dynamic and profound understanding of the environment. In this article, we will outline why real-time data streaming is beneficial before listing the leading real-time data streaming tools currently available.