
Logging

The latest news and information on log management, log analytics, and related technologies.

How we reduced flaky tests using Grafana, Prometheus, Grafana Loki, and Drone CI

Flaky tests are a problem found in almost every codebase. By definition, a flaky test is one that both succeeds and fails without any changes to the code. For example, a flaky test may pass when someone runs it locally, but then fail on continuous integration (CI). Or it may pass on CI, but then fail after someone pushes a commit that hasn’t touched anything related to the test.
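To make the definition concrete, here is a minimal, self-contained Python sketch (the function names are hypothetical, chosen for illustration) of a nondeterministic test and the naive retry-based mitigation many CI pipelines reach for first:

```python
import random

def unstable_feature():
    # Hypothetical function standing in for code whose result depends on
    # timing, ordering, or external state; randomness models that here.
    return random.random() > 0.01

def test_feature():
    # Passes on most runs but occasionally fails with no code change: flaky.
    assert unstable_feature()

def run_with_retries(test, retries=3):
    # Naive mitigation: rerun a failing test a few times before declaring
    # failure. This hides flakiness rather than fixing its root cause.
    for _ in range(retries):
        try:
            test()
            return True
        except AssertionError:
            continue
    return False
```

Retrying only masks the symptom; the toolchain in the article's title (Prometheus, Loki, Grafana, Drone CI) suggests instead measuring failure rates across many CI runs so the flaky tests can be found and fixed.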

15 Best Tools to Test and Measure Core Web Vitals [2023 Comparison]

User experience is key to ensuring the success of your website. There are many metrics that help you gauge and improve it, but Core Web Vitals are probably the most important ones. They are a set of real-world, user-centered metrics that quantify key aspects of the user experience. By measuring dimensions of web usability such as load time, interactivity, and the stability of content as it loads, Core Web Vitals help you understand how your website is doing in terms of performance.

How Can the Right Log Aggregator Help Your Enterprise?

The Internet of Things (IoT) revolution has ushered in a new age of data transfer. Each day, a massive number of new devices are added to all kinds of network infrastructures, transferring gargantuan amounts of data back and forth. Over the next decade, the number of connected IoT devices is expected to grow to a staggering 80 billion – practically outnumbering the human population tenfold.

Guide on Structured Logs [Best Practices included]

Logging is an essential aspect of system administration and monitoring: it lets you record information about an application's activity. Structured logging is the practice of using a consistent log format for your application logs so that they can be easily searched and analyzed. Structured logs allow for more efficient searching, filtering, and aggregation of log data, making it easy to extract meaningful information.
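As a minimal sketch of the idea, using only the Python standard library (the field names here are illustrative, not a prescribed schema), each log record can be emitted as a single JSON object:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)

# Wire the formatter into a standard stream handler.
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("orders")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("order created")
# Emits something like:
# {"timestamp": "2023-01-01 12:00:00,000", "level": "INFO", "logger": "orders", "message": "order created"}
```

Because every line is valid JSON with consistent keys, a log aggregator can filter on `level` or `logger` directly instead of regex-parsing free-form text.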

How Security Engineers Use Observability Pipelines

In data management, numerous roles rely on and regularly use telemetry data. The security engineer is one of these roles. Security engineers are the vigilant sentries, working diligently to identify and address vulnerabilities in the software applications and systems we use and enjoy today. Whether it’s by building an entirely new system or applying current best practices to enhance an existing one, security engineers ensure that your systems and data are always protected.

Democratizing Machine Data & Logs: How Infor saves millions by leveraging Sumo Logic's data-tiering

Infor developed a decentralized governance model for managing its vast Sumo landscape – a landscape with thousands of users, tens of thousands of Collectors, and petabytes of log ingestion. By democratizing log management, we implemented a decentralized governance model for our Sumo account. As a result, we doubled our log ingestion year-over-year while reducing our log ingestion cost by more than 50%.

Is Kubernetes Monitoring Flawed?

Kubernetes has come a long way, but the current state of open source Kubernetes monitoring needs improvement. This is partly due to the unnecessary volume of data that monitoring generates. For example, a 3-node Kubernetes cluster running Prometheus will ship around 40,000 active series by default. Do we really need all that data?

Connecting OpenTelemetry to AWS Fargate

OpenTelemetry is an open-source observability framework that provides a vendor-neutral and language-agnostic way to collect and analyze telemetry data. This tutorial will show you how to integrate OpenTelemetry with AWS Fargate, a container orchestration service that lets you run and scale containerized applications without managing the underlying infrastructure.

Root cause log analysis with Elastic Observability and machine learning

With more and more applications moving to the cloud, an increasing amount of telemetry data (logs, metrics, traces) is being collected, which can help improve application performance, operational efficiency, and business KPIs. However, analyzing this data is extremely tedious and time-consuming given the tremendous volumes being generated. Traditional methods of alerting and simple pattern matching (visual inspection, simple searches, etc.) are no longer sufficient for IT operations teams and SREs.
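To illustrate why simple pattern matching strains at scale, here is a deliberately naive Python sketch (the helper names and sample log lines are invented for illustration, and are not Elastic's method) that collapses log lines into templates and flags rare ones as potentially anomalous; real message diversity and volume quickly overwhelm hand-tuned rules like this, which is where machine-learning-based log categorization helps:

```python
import re
from collections import Counter

def template(line):
    # Collapse decimal numbers and hex ids so similar messages
    # share one pattern.
    return re.sub(r"\b(0x[0-9a-f]+|\d+)\b", "<*>", line)

def rare_patterns(lines, threshold=1):
    # Flag message templates seen at most `threshold` times.
    counts = Counter(template(line) for line in lines)
    return [p for p, c in counts.items() if c <= threshold]

logs = [
    "GET /api/orders/123 200",
    "GET /api/orders/456 200",
    "GET /api/orders/789 200",
    "disk I/O error on volume 7",
]
print(rare_patterns(logs))  # ['disk I/O error on volume <*>']
```

The three request lines collapse into one frequent template, while the lone error surfaces as a rare one – a rough, rule-based version of the pattern analysis that ML-driven tooling automates and scales.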