Monthly Archive

Beyond ping: How OpManager redefines network discovery for modern IT

Jun 30, 2025 By monicaa.mn@zohocorp.com In ManageEngine

Today’s networks aren’t just growing, they’re evolving. Hybrid architectures, cloud-native services, and a never-ending stream of connected devices have made it impossible to keep track of what’s on your network manually. This is exactly where a next-gen network discovery tool becomes a game-changer. ManageEngine OpManager is more than a monitoring solution.

Read Post

ManageEngine


VB Transform 2025: The Enterprise AI Revolution Takes Center Stage

Jun 30, 2025 By Shailesh Manjrekar In Fabrix

The Fabrix.ai team attended VentureBeat’s VB Transform conference, which returned this week as the premier gathering for enterprise AI leaders, showcasing how artificial intelligence has evolved from experimental chatbots to autonomous agents reshaping entire industries.

Read Post

Fabrix


Proactive Network Protection with Progress WhatsUp Gold 2025: SSL Certificate Monitoring That Helps Prevent Outages

Jun 30, 2025 By Greg Collins In WhatsUp Gold

A single expired SSL certificate can disrupt critical services, erode customer trust, and trigger a series of avoidable issues. That’s why we’re excited to introduce a powerful new feature in Progress WhatsUp Gold 2025.0: Certificate Discovery and Monitoring. This enhancement is more than just a checkbox on a release note; it’s a proactive safeguard designed to help you spot certificate issues before they escalate into business problems.

Read Post

WhatsUp Gold


Install Pandora ITSM from Pandora FMS Console

Jun 30, 2025 By Pandora FMS team In Pandora FMS

Until now, deploying Pandora ITSM required a standalone installation, manual database configuration, and later integration with Pandora FMS. With the new NG 783 version, that entire process has been simplified: Pandora ITSM can now be installed directly from the Pandora FMS web console, with no additional servers, no external steps, and integration already configured.

Read Post

Pandora FMS


Built for Engineers: Datadog's Vision for the Future

Jun 30, 2025 By Datadog In Datadog

Datadog was built by engineers, for engineers. At the keynote, Datadog Co-founder & CEO Olivier Pomel opened with a clear message: observability, security, and AI are converging. From infrastructure to AI agents, the future of engineering requires one unified platform. Catch all product announcements to see what’s next in observability and security on our YouTube channel!

View Video

Datadog


Get a better structure in your SCOM environment with the Opslogix Classification Management Pack

Jun 30, 2025 By Jonas Lenntun In OpsLogix

Alerts in SCOM can easily become overwhelming, making your environment feel noisy and unstructured. The real challenge is getting the right number of alerts to the right people at the right time. The Opslogix Classification Management Pack includes features like tiered classification levels, dynamic grouping, and extended tagging.

Read Post

OpsLogix


Prometheus Gauges vs Counters: What to Use and When

Jun 30, 2025 By Anjali Udasi In Last9

Choosing the wrong metric type in Prometheus can lead to inaccurate dashboards, false positives in alerting, and missed indicators of system failure. Gauge metrics are intended for tracking values that can go up and down, such as memory usage, queue depth, or the number of active connections. Unlike counters, which only increment (or reset on restart), gauges reflect the current state of a resource at scrape time.
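The distinction can be illustrated with a minimal sketch in plain Python. It does not use the Prometheus client library; the `Counter`, `Gauge`, and `total_increase` names are hypothetical stand-ins that only mimic the semantics described above, and the reset handling is a simplified version of how a decrease in a counter is usually interpreted.

```python
from dataclasses import dataclass


@dataclass
class Counter:
    """Toy counter: only goes up; a restarted process starts again from 0."""
    value: float = 0.0

    def inc(self, amount: float = 1.0) -> None:
        if amount < 0:
            raise ValueError("counters can only increase")
        self.value += amount


@dataclass
class Gauge:
    """Toy gauge: a point-in-time value that can move in either direction."""
    value: float = 0.0

    def set(self, value: float) -> None:
        self.value = value

    def inc(self, amount: float = 1.0) -> None:
        self.value += amount

    def dec(self, amount: float = 1.0) -> None:
        self.value -= amount


def total_increase(samples: list[float]) -> float:
    """Sum the growth across scraped counter samples.

    Any drop between two samples is treated as a restart, so the new
    sample counts in full (assumption: the counter restarted from 0).
    """
    increase = 0.0
    for prev, curr in zip(samples, samples[1:]):
        increase += curr - prev if curr >= prev else curr
    return increase


requests_total = Counter()
requests_total.inc()
requests_total.inc(4)

queue_depth = Gauge()
queue_depth.set(12)
queue_depth.dec(5)
```

Under these assumptions, a request count belongs in the counter because only its growth over time is meaningful, while queue depth belongs in the gauge because the current value is the signal. Applying `total_increase` to gauge samples would misread every legitimate drop as a restart, which is one way a wrong metric type produces misleading dashboards.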

Read Post

Last9


How we've created a successful FinOps practice at Datadog

Jun 30, 2025 By David M. Lentz In Datadog

When you adopt FinOps to maximize the value of your cloud spending, you may have some simple first steps you can take to gain cost efficiency. For example, you can find and delete any unused resources to quickly realize a one-time optimization. But the ongoing work to manage cloud costs becomes complex as your organization grows, your infrastructure spans multiple clouds, and you can't easily see the full value of your cloud spending by tracking only the bottom line.

Read Post

Datadog


What's New in InfluxDB 3.2: Explorer UI Now GA Plus Key Enhancements

Jun 30, 2025 By Paul Dix In InfluxData

InfluxDB 3.2 is now available for both Core and Enterprise, bringing the general availability of InfluxDB 3 Explorer, a new UI that simplifies how you query, explore, and visualize data. On top of that, 3.2 includes a wide range of performance improvements, feature updates, and bug fixes. InfluxDB 3 Core is free and open source, optimized for recent data, and licensed under MIT and Apache 2.

Read Post

InfluxData


Operational Intelligence - the new horizon of observability

Jun 30, 2025 By John Hayes In Squared Up

Monitoring your systems isn't enough anymore. Neither is “asking questions about your system”. Operational Intelligence embraces observability to proactively deliver business insights, support decision-making, and accelerate innovation. It seems that as the observability market grows and more and more products come into the space, the meaning of the term observability itself becomes more and more nebulous.

Read Post

Squared Up


Going beyond AI chat response: How we're building an agentic system to drive Grafana

Jun 30, 2025 By Yasir Ekinci In Grafana

As we look at the role AI can play in Grafana going forward, we want to move beyond the simple chat responses that dominate the world of LLMs today and into agentic systems—AI that can understand, reason, and act on your behalf. The ultimate goal is to make it easy to get things done in Grafana using natural language—whether you’re a seasoned SRE or a new developer. And in the AI world, we call this moving from chat completion to task completion.

Read Post

Grafana


Do you Grok It?

Jun 30, 2025 By Mezmo In Mezmo

Most people are probably familiar with the word “grok” from Robert A. Heinlein’s novel Stranger in a Strange Land, in which it is used to describe a deep, almost mystical understanding of something. Grok is also the name of a plugin for Logstash that enables you to parse and analyze log data using a syntax similar to regular expressions, but specialized for various log formats and fields.
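As a rough analogy only, the idea of pulling named fields out of a log line can be sketched with Python's standard `re` module. This is not the Grok plugin or its syntax; the log line layout, the `LINE_PATTERN` expression, and the field names are all assumptions made up for illustration.

```python
import re

# Hand-written stand-ins for the reusable named patterns a grok
# expression would reference (hypothetical field names and format).
LINE_PATTERN = re.compile(
    r"(?P<client>\d{1,3}(?:\.\d{1,3}){3}) "   # IPv4-looking address
    r"(?P<method>[A-Z]+) "                     # HTTP verb
    r"(?P<request>\S+) "                       # request path
    r"(?P<bytes>\d+) "                         # response size
    r"(?P<duration>\d+(?:\.\d+)?)"             # seconds, integer or decimal
)


def parse_line(line: str):
    """Return the named fields as a dict, or None if the line does not match."""
    match = LINE_PATTERN.match(line)
    return match.groupdict() if match else None


fields = parse_line("55.3.244.1 GET /index.html 15824 0.043")
```

Here `fields["client"]` holds the address and `fields["request"]` the path, both as strings. A pattern library specialized for log formats saves you from writing and maintaining sub-expressions like these by hand.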

Read Post

Mezmo


Run Telegraf, InfluxDB 3 & Grafana in Minutes with Docker

Jun 30, 2025 By InfluxData In InfluxData

Learn how to set up the open source TIG stack.

View Video

InfluxData


Custom Alerts in Checkly

Jun 30, 2025 By Checkly In Checkly

Learn how to customize your alerts in Checkly to get only the notifications you need. This video walks through account-wide alert settings, managing alert channels, using groups for business-critical checks, and leveraging Monitoring as Code to manage everything from your IDE. Plus, see how to use the Checkly CLI to import existing checks from the UI into code for full version control and automation.

View Video

Checkly


Optimizing mobile website performance using digital experience monitoring

Jun 29, 2025 By Ramkumar Ramaswamy In Site24x7

Delivering an exceptional mobile user experience (UX) is critical for business success. With mobile devices generating over 60% of global web traffic (2024) from billions of active users, a subpar mobile experience can lead to lost customers and revenue. Slow-loading pages, poor interactivity, and unstable layouts caused by design choices frustrate users. Bad UX quickly drives disgruntled users to competitors, and it is a one-way street.

Read Post

Site24x7


How to Reduce Application Downtime with APM?

Jun 28, 2025 By Mohana Ayeswariya J In Atatus

According to a recent 2025 study, the average cost of downtime has inched as high as $9,000 per minute for large organizations. For higher-risk enterprises like finance and healthcare, downtime can eclipse $5 million an hour in certain scenarios. Whether you're part of a DevOps team, an SRE, a developer, or an engineering manager, minimizing application downtime should be a critical focus. One of the most effective ways to achieve this is through Application Performance Monitoring (APM).

Read Post

Atatus


Sponsored Post

SAP system refresh automation

Jun 27, 2025 By Avantra Team In Avantra

SAP system refresh automation is extremely powerful when leveraged with care; system refreshes are complex and challenging processes to manage. System refreshes can be fraught with risk for organizations with critical data due to their level of complexity. Mitigating this risk comes down to knowing the benefits of automation and how the processes work. This article will help you: To try out Avantra's SAP automation features, sign up for a free trial.

Read Post

Avantra


CXO Focus by ManageEngine ITOM

Jun 27, 2025 By ITOM In ManageEngine

The central source for C-level insights on IT trends and transformation.

Read Post

ManageEngine


ManageEngine recognized in the 2025 Gartner® Market Guide for Infrastructure Monitoring Tools

Jun 27, 2025 By OpManager Plus In ManageEngine

This recognition highlights our commitment to simple, cost-effective IT monitoring—no frills, just results.

Read Post

ManageEngine


What Causes Packet Loss and How to Fix It

Jun 27, 2025 By Wendy Howard In eG Innovations

Packet loss is likely a familiar issue if you’ve managed networks susceptible to slower network speeds, degraded data quality, or increased latency. It directly impacts network operations and digital experiences, which is why we recommend that you take the time to understand and prevent it.

Read Post

eG Innovations


What's Slowing Down Your App? Common Performance Issues APM Can Solve

Jun 27, 2025 By Pavithra Parthiban In Atatus

Application performance is critical to user experience and business success. When an application starts slowing down, identifying the root cause isn’t always straightforward. For developers, DevOps engineers, and SREs, Application Performance Monitoring (APM) tools provide real-time visibility into how applications behave under load.

Read Post

Atatus


How Dropbox rebuilt its logging stack with Grafana Loki after a data center went dark

Jun 27, 2025 By Colin Steele In Grafana

Two years ago, a power outage knocked a Dropbox data center offline. It wasn’t just any data center. It was the only one where Dropbox hosted Grafana Loki, meaning engineers couldn’t access their log data. “We had considered a data center outage when we were rolling out Loki, but it had just never risen up in priority enough to get put into multiple data centers,” said Chris Hodges, an infrastructure software engineer at the cloud storage company.

Read Post

Grafana


Route your monitor alerts with Datadog monitor notification rules

Jun 27, 2025 By Khang Truong In Datadog

As organizations scale their infrastructure, monitoring systems can become a source of noise rather than insight. A clean, straightforward set of alerts for a handful of services can quickly spiral into a mess of overlapping thresholds, redundant triggers, and inconsequential notifications across hundreds (or thousands) of components. This flood of notifications can slow response times, overwhelm engineers, and increase the chance of overlooking critical problems.

Read Post

Datadog


The Open Source Observability Podcast - EP #1: Clickhouse, Data Lakes, and AWS S3 with Joshua Lee

Jun 27, 2025 By Coroot In Coroot

In this episode we get to dive into some of Josh's favourite databases and telemetry sources for observability. Listen to learn what open source software you could benefit from including in your toolstack! Joshua Lee is a Developer Advocate at Altinity, where he applies his observability and engineering background to ClickHouse use cases and creates educational content to support the open source community. He has over 15 years of experience in leading software projects for a broad scope of industries.

View Video

Coroot


How to Handle the NumberFormatException in Java

Jun 27, 2025 By Rollbar In Rollbar

The NumberFormatException is one of the most common runtime exceptions you'll encounter in Java. It's an unchecked exception that occurs when you try to convert a string to a numeric value, but the string format isn't compatible with the target number type. Simply put, if you attempt to parse "hello" as an integer or "12.5" as an integer, Java throws a NumberFormatException because these strings can't be converted to the expected numeric format.

Read Post

Rollbar


Drive Public Sector Efficiencies of Scale with Splunk and AWS

Jun 27, 2025 By Eduardo Nzambi In Splunk

Today’s public sector organizations are tasked with delivering a staggering amount of technology capabilities to support a growing set of digital services, meet IT modernization goals, and continue to protect against a wide range of attack vectors. Cloud technology adoption has played a significant role in ensuring that ongoing IT modernization not only aligns with each organization’s mission-strategic capabilities but also enables efficiencies of scale.

Read Post

Splunk


CPU monitoring for network admins: Why it matters more than ever

Jun 27, 2025 By monicaa.mn@zohocorp.com In ManageEngine

In your role as a network administrator, maintaining smooth, uninterrupted system performance isn’t just a one-time task; it’s your daily mission. Whether you're managing hundreds of endpoints, virtual machines, or hybrid cloud environments, CPU monitoring is one of the most critical tools in your toolkit. Without it, diagnosing performance slowdowns, service lags, or outages becomes reactive guesswork.

Read Post

ManageEngine


Event Intelligence Solutions: The Essential Tools Every ITOps Manager Needs - and How Interlink Software Delivers

Jun 27, 2025 By david.arrowsmith In Interlink

IT Operations (ITOps) managers need to ensure always-on availability across a more complex and hybrid ecosystem than ever before. Event storms, patchwork toolchains and slow root cause analysis (RCA) impede responsiveness and undermine the high digital performance customers demand. The Event Intelligence and Service Observability Platform from Interlink Software addresses this.

Read Post

Interlink


Why Splunk Cloud Platform on Azure?

Jun 27, 2025 By Splunk In Splunk

See how the Splunk Cloud Platform’s AI capabilities unlock enterprise-level observability and security applications, and how pairing them with Microsoft Azure gives you a powerful managed software-as-a-service solution.

View Video

Splunk


From Detection to Resolution: How Selector + Itential Deliver AI-Driven Observability and Automated Recovery

Jun 27, 2025 By Dallon Robinette In Selector

Every second counts when it comes to detecting, diagnosing, and resolving network incidents, yet many teams still find themselves stuck in reactive mode, drowning in alerts, manually writing scripts, and managing tickets across disconnected systems. This is where Selector and Itential come in. Together, Selector and Itential deliver a powerful, enterprise-ready solution that closes the loop between detection and action.

Read Post

Selector


We've added custom timeframes to our dashboards!

Jun 27, 2025 By SquaredUp In Squared Up

We understand that one-size-fits-all timeframes don’t always meet your needs. Our new feature means you get more from your SquaredUp dashboards.

View Video

Squared Up


What's New in Flowmon ADS 12.5?

Jun 27, 2025 By Progress Flowmon In Flowmon

In this webinar, we’ll introduce the new features, including:
AI-Powered Threat Briefings – A new dashboard that correlates global threat intelligence with your network’s current and historical data.
Enhanced Event Visualization – Dive into a redesigned event detail that streamlines the user experience.
Expert-Level Recommendations – Guided next steps for each detection, helping analysts of all skill levels validate and resolve incidents with confidence.

View Video

Flowmon


The impact of generative AI with the Splunk Cloud Platform

Jun 27, 2025 By Splunk In Splunk

See how, with the Splunk Cloud Platform, you can now use an AI-powered platform to increase productivity and deliver faster detection and response to address IT challenges.

View Video

Splunk


Monitoring Behind the Great Firewall

Jun 27, 2025 By Dotcom-Monitor In Dotcom-Monitor

As Site Reliability Engineers (SREs) managing global infrastructure, we face unique challenges when serving users in mainland China. The Great Firewall of China creates a complex web of technical obstacles that can render even the most robust international websites slow, unreliable, or completely inaccessible to Chinese users.

Read Post

Dotcom-Monitor


Can AI/ML Guide Observability? Tech Talk #6

Jun 27, 2025 By VictoriaMetrics In VictoriaMetrics

This talk will examine the application of Artificial Intelligence and Machine Learning in observability. It will cover how AI/ML is being used to monitor systems, detect anomalies, and extract insights from telemetry data. The session will provide information on integrating AI/ML into observability pipelines, improving analytical capabilities, and system performance.

View Video

VictoriaMetrics


Nexthink Achieves FedRAMP "In Process" Designation

Jun 27, 2025 By Nexthink In Nexthink

We are proud to announce a significant advancement in our commitment to serving the US federal market – Nexthink is now listed as “In Process” in the FedRAMP marketplace. To achieve this, we have been working closely with our federal consultant, Quzara, to complete a rigorous security assessment. Through this process, we're implementing hundreds of required controls to meet the highest standards of cloud security.

Read Post

Nexthink


F5 Monitoring on Microsoft SCOM

Jun 26, 2025 By NiCE IT Mgmt In NiCE IT Mgmt

As part of a recent customer project, we developed a custom F5 Management Pack for Microsoft System Center Operations Manager (SCOM). This bespoke solution enables IT operations teams to monitor the performance, availability, and health of F5 infrastructure directly within the SCOM environment. It provides deep visibility into key metrics, helping ensure application delivery remains stable, secure, and efficient.

Read Post

NiCE IT Mgmt


cplace Monitoring on Microsoft SCOM

Jun 26, 2025 By NiCE IT Mgmt In NiCE IT Mgmt

As part of a recent customer project, we developed a custom cplace Management Pack for Microsoft System Center Operations Manager (SCOM). This tailored solution enables IT operations teams to monitor the performance, availability, and health of cplace environments directly within the SCOM framework.

Read Post

NiCE IT Mgmt


Hidden Value in Sumo Logic: What Customers Often Miss -- Customer Brown Bag -- June 26th, 2025

Jun 26, 2025 By Sumo Logic, Inc. In Sumo Logic

Join us as Andy Makings reveals 12 powerful tips and tricks that many users overlook in Sumo Logic. These practical insights can streamline your daily workflows and unlock deeper, more actionable intelligence from your data.

View Video

Sumo Logic


Improve SLO accuracy and performance with Datadog Synthetic Monitoring

Jun 26, 2025 By Addie Beach In Datadog

SLOs are key for improving user satisfaction, prioritizing engineering projects, and measuring overall performance. Given the important role that SLOs play in determining organizational benchmarks, teams need to ensure that SLO metrics—also called service level indicators (SLIs)—are reported accurately and maintained consistently within an acceptable range.

Read Post

Datadog


Troubleshooting: No data or monitor not created for .NET applications in Site24x7 APM Insight

Jun 26, 2025 By ManageEngine Site24x7 In Site24x7

Are your .NET applications not showing up in Site24x7 APM Insight? This step-by-step video will help you troubleshoot missing data or monitor issues for both IIS-hosted applications and Windows Services.

View Video

Site24x7


How to detect vulnerable GitHub Actions at scale with Zizmor

Jun 26, 2025 By James Crocker In Grafana

As we previously reported on April 26, 2025, we had a security incident via an insecure GitHub Action and we have since published a post-incident review. We have confirmed that there has been no code modification, unauthorized access to production systems, exposure of customer data, or access to personal information.

Read Post

Grafana


How to Use an AI Assistant with Your Monitoring System - VictoriaMetrics MCP Server

Jun 26, 2025 By VictoriaMetrics In VictoriaMetrics

Alex Marshalov explores the new VictoriaMetrics MCP Server. He moves beyond the hype to show what's truly possible today. The presentation offers a builder's perspective on integrating AI with time-series data, featuring a demo that showcases both the potential and the current realities (yes, there are some). See how we're thinking about solving complex monitoring challenges with AI.

View Video

VictoriaMetrics


Next Level - Infrastructure Monitoring and Load Balancing

Jun 26, 2025 By Progress WhatsUp Gold In WhatsUp Gold

Infrastructure monitoring and load balancing: are you getting the most out of these solutions? Modern network infrastructures are complex, and yours is no different. As such, a load balancing solution is required to keep the servers up and running for both your customers and employees. The same goes for network traffic monitoring and analysis to understand user behavior.

View Video

WhatsUp Gold


Prometheus and CloudWatch Integration for AWS Metric Collection

Jun 26, 2025 By Anjali Udasi In Last9

The Prometheus CloudWatch exporter pulls AWS CloudWatch metrics into your Prometheus setup, giving you a unified view of your infrastructure alongside application metrics. If you're already running Prometheus and need visibility into AWS services like EC2, RDS, or Lambda, this exporter handles the integration without forcing you to switch monitoring stacks.

Read Post

Last9


Elastic Cloud Serverless now generally available on Microsoft Azure

Jun 26, 2025 By Yuvi Gupta In Elastic

Elastic Cloud Serverless provides the fastest way to start and scale security, observability, and search solutions without managing infrastructure. Today, we are excited to announce the general availability of Elastic Cloud Serverless on Microsoft Azure, now available in the EastUS region.

Read Post

Elastic


Jose shares what is new in VictoriaMetrics Cloud as of June 2025.

Jun 26, 2025 By VictoriaMetrics In VictoriaMetrics

Curious as to what is new with VictoriaMetrics Cloud? In this video Jose covers:
1. Access Tokens: tenancy improvements
2. MCP Server integration
3. Direct code integration. New API lib
4. Lots of new integrations out of the box

View Video

VictoriaMetrics

Monitoring


Fabric Interconnect: Connecting Servers with UCS Hardware

Jun 26, 2025 By Jordan Rothstein In eG Innovations

Every IT decision-maker faces a common challenge: balancing operational efficiency with cost control. While software solutions help streamline operations and drive efficiency, they can introduce redundancies into your system. These redundancies strengthen availability through backup systems but often complicate data management, leading to inconsistencies and potential outages. This is where hardware solutions like fabric interconnects prove invaluable.

Read Post

eG Innovations


Your integrated MSP platform is here - announcing the launch of MSP Central!

Jun 26, 2025 By Kamya Swaminathan In ManageEngine

For MSPs, the daily juggling act of managing multiple tools can be a drain on resources, efficiency, and, ultimately, your bottom line. You need comprehensive visibility, streamlined workflows, and the ability to proactively address client needs—all without the headache of disparate systems. That's why we're excited to announce the launch of MSP Central, your comprehensive, unified platform for streamlined MSP business management.

Read Post

ManageEngine


Logz.io Adds PrivateLink Support, Introduces the Parsing Rules Hub, and Significantly Enhances Parsing Capabilities

Jun 26, 2025 By Jade Lassery In logz.io

Today, we’re excited to announce support for AWS PrivateLink, allowing Logz.io customers to securely send logs and metrics through private VPC connectivity, without any data ever hitting the public internet. If you’re running workloads inside a VPC on AWS, this upgrade drastically improves your security posture, simplifies your networking architecture, and – most notably – reduces your data transfer costs (a lot).

Read Post

logz.io


Microservices to Monolith, Rebuilding Our Backend in Rust

Jun 26, 2025 By Logan Cox In InfluxData

The following serves as a practical guide for those looking to simplify their architecture by migrating to a Rust monolith. Earlier this year, the platform team at InfluxData undertook a major rewrite of our core account and resource management APIs, moving from Go to Rust and from a microservices architecture to a single monolith. This change supported a new administrative UI for InfluxDB Cloud Dedicated and aligns with our broader effort to rewrite the InfluxDB database engine in Rust.

Read Post

InfluxData


Can AI/ML Guide Observability? Tech Talk #6

Jun 26, 2025 By VictoriaMetrics In VictoriaMetrics

View Video

VictoriaMetrics


Elastic's journey to build Elastic Cloud Serverless

Jun 26, 2025 By Elastic Cloud Serverless team In Elastic

Stateless architecture that auto-scales no matter your data, usage, and performance needs. How do you take a stateful, performance-critical system like Elasticsearch and make it serverless? At Elastic, we reimagined everything, from storage to orchestration, to build a truly serverless platform that customers can trust. Elastic Cloud Serverless is a fully managed, cloud-native platform designed to bring the power of the Elastic Stack to developers without the operational burden.

Read Post

Elastic


Introducing AI Agent Monitoring

Jun 26, 2025 By Sasha Blumenfeld In Sentry

AI is changing how we build software — but debugging code still comes down to having context. One minute the model’s performance is cruising. The next, you’re hit with a KeyError from a tool you forgot existed, triggered by a model that silently timed out, and a retrieval call that returns... nothing, or 11 “Let me try this a different way" messages before failure. You’re stitching together LLM calls, agents, vector stores, and custom logic. Then hoping it holds up in prod.

Read Post

Sentry


Achieving Full Visibility: Modern Monitoring for Distributed Cloud Applications

Jun 26, 2025 By Catchpoint In Catchpoint

Today’s applications are hybrid, cloud-centric, service-oriented, API-dependent, and geographically distributed. The monitoring practices we relied on for decades are no longer sufficient. It is critical to monitor all the internet-centric dependencies, connectivity, and cloud application components – and to do so from the user’s perspective so IT operations teams can achieve digital resilience and deliver performance. This session will cover DEM, APM, and IPM and how they can work together to pinpoint issues before they occur, so users receive a great digital experience.

View Video

Catchpoint


Introducing AI Agent Monitoring in Sentry

Jun 26, 2025 By Sentry In Sentry

Monitoring agents and LLM applications is... different. You're managing everything from tool calls to model configurations and token usage, and AI systems do their best to solve problems on their own, so errors aren't always clear. Sentry's agent monitoring focuses on making it easy to dive into your AI applications and understand what's breaking and where, so you can fix it faster.

View Video

Sentry

Read more about Introducing AI Agent Monitoring in Sentry

Instant Open Source Observability: the magic of eBPF

Jun 26, 2025 By Coroot In Coroot

Alex chats with Josh Lee, a Developer Advocate at Altinity with over 15 years of experience in software development and management, about why he loves using Coroot.

View Video

Coroot

Read more about Instant Open Source Observability: the magic of eBPF

Data nightmare: Majority of UK IT decision makers fear exponential rise of data within their businesses

Jun 25, 2025 By Splunk In Splunk

As demand for data-driven insights increases, data overload could render UK businesses operationally ineffective.

Read Post

Splunk

Read more about Data nightmare: Majority of UK IT decision makers fear exponential rise of data within their businesses

How Sentry's Seer AI Agent passes legal review: a guide for legal teams reviewing Seer

Jun 25, 2025 By Virginia Badenhope In Sentry

If your legal department is anything like ours, you’re being inundated with requests from the business to use more and more AI tools. Whether it's developers wanting to use coding agents like Cursor, security teams implementing AI-driven investigations, or sales and marketing leveraging AI for call insights and competitive research, we've seen a shift in what teams are trying and buying.

Read Post

Sentry

Read more about How Sentry's Seer AI Agent passes legal review: a guide for legal teams reviewing Seer

Now you can use Sentry Insights to trigger alerts and debug issues

Jun 25, 2025 By Ben Coe In Sentry

You deploy a fix late Friday and spend the weekend refreshing dashboards, hoping nothing breaks. You shouldn’t have to babysit a dashboard to know when something’s wrong. With the latest updates to Insights, you can now create alerts directly from any chart. Whether it’s a spike in 4xx errors after a deploy, a jump in P95 latency for an API endpoint, or a drop in throughput for a background job, you can set up alerts with just two clicks.

Read Post

Sentry

Read more about Now you can use Sentry Insights to trigger alerts and debug issues

Understanding APM and Distributed Tracing in the Observability Stack

Jun 25, 2025 By Pavithra Parthiban In Atatus

To keep modern applications running smoothly, you need more than just basic monitoring. APM (Application Performance Monitoring) gives you a broad overview, tracking metrics like latency, errors, and system health. Distributed Tracing, on the other hand, shows the full journey of each request across services, helping you pinpoint the root cause of slowdowns or failures.

Read Post

Atatus

Read more about Understanding APM and Distributed Tracing in the Observability Stack

Grafana Cloud updates: The latest features in Kubernetes Monitoring, Fleet Management, and more

Jun 25, 2025 By Kristin Knapp In Grafana

We consistently roll out helpful updates and fun features in Grafana Cloud, our fully managed observability platform powered by the open source Grafana LGTM Stack (Loki for logs, Grafana for visualization, Tempo for traces, and Mimir for metrics). In case you missed them, here’s our monthly round-up of the latest and greatest Grafana Cloud updates.

Read Post

Grafana

Read more about Grafana Cloud updates: The latest features in Kubernetes Monitoring, Fleet Management, and more

Trace Distributed Map states for AWS Step Functions with Datadog

Jun 25, 2025 By Abhinav Vedmala In Datadog

AWS Step Functions offers the Distributed Map state, enabling you to coordinate massively parallel workloads within your serverless applications. With this feature, a single Step Functions execution can fan out into up to 10,000 parallel workflows simultaneously, making it possible to efficiently process millions of items in parallel. This capability unlocks new possibilities for large-scale data processing, such as image transformation, log ingestion, or batch analytics.

Read Post

Datadog

Read more about Trace Distributed Map states for AWS Step Functions with Datadog

What is log tagging and how to configure it in Site24x7

Jun 25, 2025 By ManageEngine Site24x7 In Site24x7

In this video, learn what Site24x7's log tag is and how to configure, categorize, filter, and monitor your logs more effectively, so you can create custom log tags that give you full visibility into your logs or categorize them even better. Here’s what you’ll learn: Whether you're an IT professional, DevOps engineer, or security analyst, this video will help you create smarter tags for better monitoring decisions.

View Video

Site24x7

Read more about What is log tagging and how to configure it in Site24x7

Infrastructure monitoring with Site24x7 | Cloud, Kubernetes, and Hybrid Environments

Jun 25, 2025 By ManageEngine Site24x7 In Site24x7

Modern IT environments are dynamic, distributed, and constantly evolving. You need more than traditional monitoring to keep everything running smoothly. Site24x7 is your all-in-one, AI-powered infrastructure monitoring solution. What this video covers: Whether you're overseeing AWS, Azure, GCP, OCI, VMware, or Kubernetes, Site24x7 simplifies it all with a single agent and AI-driven insights.

View Video

Site24x7

Read more about Infrastructure monitoring with Site24x7 | Cloud, Kubernetes, and Hybrid Environments

Honeycomb: Basics to Business Value

Jun 25, 2025 By Honeycomb In Honeycomb

Send Honeycomb JSON or sophisticated OpenTelemetry. Traces, logs, metrics. Then add a few business-specific fields (or calculate them), and answer important questions on the fly.

View Video

Honeycomb

Read more about Honeycomb: Basics to Business Value

The Road to Loki 4.0 (Loki Community Call June 2025)

Jun 25, 2025 By Grafana In Grafana

In this Loki Community Call, we welcome back Ed Welch, Principal Engineer on the Loki team. We will be discussing with Ed what is next for Loki as we push forward to Loki 4.0. If you are interested in learning more about potential architecture changes and storage formats, and in an open discussion on where Ed and the Loki team would like to take Loki in the future, then make sure you join us live and have your questions answered!

View Video

Grafana

Read more about The Road to Loki 4.0 (Loki Community Call June 2025)

Observability Across Asia-Pacific: What's Holding Teams Back? | 2025 Observability Survey Analysis

Jun 25, 2025 By Grafana In Grafana

What’s holding back observability maturity in Asia-Pacific? Grafana Labs' cofounder Anthony Woods shares key takeaways from the largest global observability survey. Learn how SaaS, budget concerns, and org structure are shaping Asia-Pacific (APAC)'s future. Grafana Cloud is the easiest way to get started with Grafana dashboards, metrics, logs, and traces. Our forever-free tier includes access to 10k metrics, 50GB logs, 50GB traces and more.

View Video

Grafana

Read more about Observability Across Asia-Pacific: What's Holding Teams Back? | 2025 Observability Survey Analysis

What Is Session Replay and How It Improves User Experience in IT Environments

Jun 25, 2025 By Isaac García In Pandora FMS

Anyone who works in technology quickly learns this truth: users will always interact with systems in the most unexpected and baffling ways… and when something goes wrong, they swear they “didn’t touch anything.” There’s a vast ocean between how something is designed and how it’s actually used—an ocean filled with bugs waiting to be caught. But there’s a way to bridge that gap: session replay.

Read Post

Pandora FMS

Read more about What Is Session Replay and How It Improves User Experience in IT Environments

How to Reduce IT Costs on Hardware Refresh Cycles

Jun 25, 2025 By Nexthink In Nexthink

IT budgets are under pressure, and hardware refresh costs continue to climb. For End User Computing (EUC) and IT professionals, the traditional time-based approach to managing device lifecycles is no longer viable. Simply replacing laptops and desktops every three to five years doesn’t reflect actual device performance, usage patterns, or business needs. The solution? A smarter, data-driven hardware refresh strategy that balances performance, cost-efficiency, and employee experience.

Read Post

Nexthink

Read more about How to Reduce IT Costs on Hardware Refresh Cycles

Introducing Cause Analysis: Instant Triage for Traffic Changes with Kentik AI

Jun 25, 2025 By Eric Hian-Cheong In Kentik

Introducing Cause Analysis from Kentik, designed to simplify network traffic analysis and rapidly identify the root cause of issues. Learn how this exciting new feature streamlines troubleshooting, makes complex insights accessible, and boosts team efficiency for all users.

Read Post

Kentik

Read more about Introducing Cause Analysis: Instant Triage for Traffic Changes with Kentik AI

Amazon SQS Metrics: Monitor, Debug, and Optimize Your Message Queues

Jun 25, 2025 By Anjali Udasi In Last9

Message queues quietly take care of a lot—buffering workloads, smoothing traffic spikes, and keeping services connected. But they don’t always get much attention until something feels off. Amazon SQS offers a solid set of metrics to help you understand how your queues are doing, whether you’re scaling well or nearing limits. This blog breaks down the key SQS metrics: where to find them, what they mean, and how to respond when things start to shift.

Read Post

Last9

Read more about Amazon SQS Metrics: Monitor, Debug, and Optimize Your Message Queues

Deploying a WhatsUp Gold 360 Connector for Hyper-V

Jun 25, 2025 By Progress WhatsUp Gold In WhatsUp Gold

WhatsUp Gold 360 provides real-time insights into your internet connectivity to your remote sites through the use of connectors. Watch this video to learn how to create and deploy a WhatsUp Gold 360 connector to a Hyper-V environment.

View Video

WhatsUp Gold

Read more about Deploying a WhatsUp Gold 360 Connector for Hyper-V

How to Configure Docker's Shared Memory Size (/dev/shm)

Jun 25, 2025 By Faiz Shaikh In Last9

Your Node.js app runs fine on your machine. But inside Docker? You start getting weird crashes—ENOSPC: no space left on device. Chrome headless tests fail out of nowhere. PostgreSQL throws shared memory errors under load. The problem? It’s probably /dev/shm, the shared memory volume Docker sets up by default. Most containers get just 64MB of space here.
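
To see whether a container is actually running with the small default described above, one option is to measure the shared memory mount from inside it. The following is a minimal Python sketch, not taken from the post: the helper names `shm_report` and `looks_like_default` are hypothetical, and the 64 MB figure is simply the default quoted above.

```python
import shutil

MB = 1024 * 1024


def shm_report(path="/dev/shm"):
    """Return (total_mb, free_mb) for the filesystem mounted at `path`."""
    usage = shutil.disk_usage(path)
    return usage.total / MB, usage.free / MB


def looks_like_default(total_mb, default_mb=64, tolerance_mb=1):
    """True when the mount size is within `tolerance_mb` of the assumed default."""
    return abs(total_mb - default_mb) <= tolerance_mb
```

Called with no arguments inside a container, `shm_report()` reports on `/dev/shm`. If the total comes back at about 64 MB and the workload needs more, the `--shm-size` option of `docker run` (for example `--shm-size=256m`) is the usual way to raise it.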

Read Post

Last9

Read more about How to Configure Docker's Shared Memory Size (/dev/shm)

How to Create a Free Status Page in Under 5 Minutes

Jun 25, 2025 By Leo Baecker In Hyperping

Your website goes down at 2 AM. Your customers wake up to broken services, flooded support inboxes, and zero communication from your team. By the time you're awake and fixing things, trust is already damaged. A status page prevents this nightmare scenario. But here's the thing — most teams keep putting it off because they think it's complicated, expensive, or time-consuming. It's not. You can create a professional status page in under 5 minutes, completely free. I'll show you exactly how.

Read Post

Hyperping

Read more about How to Create a Free Status Page in Under 5 Minutes

How One MSP Used AI to Cut Noise by 78% and Reclaim Engineering Time

Jun 25, 2025 By LogicMonitor In LogicMonitor

An operations team at one of Asia-Pacific’s largest managed service providers (MSPs) was drowning in their own success. Years of investment in monitoring tools and automation had created comprehensive visibility, and comprehensive chaos. Engineers opened dashboards each morning to find thousands of alerts waiting, with critical incidents buried somewhere inside. The scale of the problem was overwhelming their capacity to respond effectively.

Read Post

LogicMonitor

Read more about How One MSP Used AI to Cut Noise by 78% and Reclaim Engineering Time

How To Implement Commit Tracing For ArgoCD Setups

Jun 25, 2025 By Sematext In Sematext

How do you use ArgoCD and a GitHub workflow together with an external CI/CD tool (Jenkins in our case) to trace a specific PR/commit hash code in a GitHub repository that’s external to ArgoCD? The term ‘commit tracing’ comes from Michael Crenshaw – one of the lead developers of Argo.

Read Post

Sematext

Read more about How To Implement Commit Tracing For ArgoCD Setups

Why synthetic testing is the secret to proactive Teams management

Jun 25, 2025 By Sara Purdon In Martello Technologies

The more organizations depend on collaboration solutions like Microsoft Teams for productivity, the more IT departments are expected to ensure a seamless experience every time. That demands more than just rapid troubleshooting when issues occur: it requires IT teams to get ahead of problems and keep them from affecting users in the first place. For that, synthetic testing is a must.

Read Post

Martello Technologies

Read more about Why synthetic testing is the secret to proactive Teams management

See more, solve more with end-to-end network path tracing

Jun 25, 2025 By Sara Purdon In Martello Technologies

Few things hold IT teams back more than a lack of visibility. It’s exponentially harder to solve issues when they originate in parts of the environment you can’t see. That’s one of the big limitations of native tools for monitoring and managing Microsoft Teams. Microsoft Call Quality Dashboard, Admin Center and Service Dashboard, and Meeting Room Pro Dashboard are all constrained to the aspects of Teams that Microsoft controls directly.

Read Post

Martello Technologies

Read more about See more, solve more with end-to-end network path tracing

Resilience through Chaos: Integrating IPM into your Observable SDLC (OSDLC)

Jun 25, 2025 By Catchpoint In Catchpoint

The idea of deliberately introducing failures into your systems to test for weaknesses—what’s known as chaos engineering—can be intimidating. But what if you could evaluate resilience without actually breaking anything?

View Video

Catchpoint

Read more about Resilience through Chaos: Integrating IPM into your Observable SDLC (OSDLC)

Fixing issues faster with anomaly detection

Jun 25, 2025 By Sentry In Sentry

With application performance monitoring and statistics, Sentry.io predicts expected values for any metrics query—and alerts you when your app’s performance falls outside the norm.

View Video

Sentry

Read more about Fixing issues faster with anomaly detection

Top 10 Network Monitoring Tools to Boost Your IT Performance

Jun 25, 2025 By OpsMatters In OpsMatters

In today's digital landscape, a strong and secure network forms the foundation of any organization. When networks go down, face performance issues, or encounter security risks, companies can suffer significant financial losses and damage to their reputation. IT teams need network monitoring tools to stay on top of performance, spot problems, and keep things running. As AI, cloud-based solutions, and automation improve, 2025 brings a range of powerful tools to make your IT setup work better.

Read Post

OpsMatters

Read more about Top 10 Network Monitoring Tools to Boost Your IT Performance

Why do hotel rooms have smoke detectors in every room, not just one on every floor?

Jun 24, 2025 By Catchpoint In Catchpoint

Early detection matters. When a problem occurs, you want to know immediately, not after the damage is done. Monitoring isn’t just about visibility; it’s about precision, speed, and proximity to the problem. Just like smoke detectors, you need to monitor in the right places: close to your critical infrastructure, applications, and end users. The sooner you detect issues, the cheaper and easier they are to fix. And that’s where real resilience begins.

View Video

Catchpoint

Monitoring

Read more about Why do hotel rooms have smoke detectors in every room, not just one on every floor?

Fireside Chat: Observability Lessons and Practices from a Fortune 500 Leader

Jun 24, 2025 By Catchpoint In Catchpoint

Join SAP CX's Martin Norato Auer, VP of Observability, and Catchpoint’s Nick Homan as we explore SAP CX’s journey from fragmented alert management to a scalable, standardized observability model. In this candid fireside chat, Martin shares how his team overcame alert fatigue, integrated observability with automation and BI, and scaled their practices across multiple SAP CX products with APM & Internet Performance Monitoring (IPM).

View Video

Catchpoint

Read more about Fireside Chat: Observability Lessons and Practices from a Fortune 500 Leader

Observability Without Tradeoffs: Introducing Powerful New Honeycomb Telemetry Pipeline Features

Jun 24, 2025 By Elsie Phillips In Honeycomb

Every day, enterprise companies generate terabytes of observability data while engineering teams are under pressure to cut costs. One of the easiest ways to reduce observability bills is through sampling: intentionally sending only a representative portion of telemetry data, rather than the full volume, to your observability tool. But turning down the dial is risky.
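
As an illustration of the sampling idea described above, here is a minimal Python sketch of deterministic, hash-based sampling. It is a generic technique under stated assumptions, not Honeycomb's implementation: the function names, the `trace_id` field, and the `sample_rate` field are all hypothetical. Each kept event records the rate it was sampled at, so totals can still be estimated from the reduced volume.

```python
import hashlib


def keep_trace(trace_id, sample_rate):
    """Keep roughly 1 in `sample_rate` traces, decided only by the trace ID."""
    if sample_rate < 1:
        raise ValueError("sample_rate must be >= 1")
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")
    return bucket % sample_rate == 0


def sample(events, sample_rate):
    """Return the kept events, each annotated with the rate it was sampled at."""
    return [
        {**event, "sample_rate": sample_rate}
        for event in events
        if keep_trace(event["trace_id"], sample_rate)
    ]


def estimated_total(kept_events):
    """Estimate the original event count: each kept event stands for `sample_rate` events."""
    return sum(event["sample_rate"] for event in kept_events)
```

Because the decision depends only on the trace ID, every span of a given trace is kept or dropped together. The risk mentioned above still applies: rare events can fall entirely into the dropped portion.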

Read Post

Honeycomb

Read more about Observability Without Tradeoffs: Introducing Powerful New Honeycomb Telemetry Pipeline Features

11 Best Log Monitoring Tools for Developers in 2025

Jun 24, 2025 By Anjali Udasi In Last9

Your checkout API just started throwing 500s during peak traffic. You SSH into production, tail logs across six microservices, and realize the database timeout buried in one service's logs is causing cascading failures. Two hours later, you've fixed it, but you're thinking: "There has to be a better way." There is. Log monitoring tools centralize logs from your entire stack, making debugging systematic instead of archaeological.

Read Post

Last9

Read more about 11 Best Log Monitoring Tools for Developers in 2025

Grafana Cloud: Manage the AWS Observability app as code with Terraform

Jun 24, 2025 By Ana Ivanov In Grafana

Imagine setting up your AWS configuration in Grafana Cloud by hand and clicking through menus. When you only have a few services, it’s not a big deal. But as you add more and more, keeping track of every little change becomes a headache. It’s easy to make mistakes, and before you know it, things can get out of sync and your monitoring becomes unreliable.

Read Post

Grafana

Read more about Grafana Cloud: Manage the AWS Observability app as code with Terraform

How Cursor scaled infrastructure rapidly and reliably using Datadog

Jun 24, 2025 By Datadog In Datadog

At Datadog, we use Cursor to empower our teams to build more quickly. And we know that building and troubleshooting with AI tools like Cursor is done best with the right observability data and context. Discover how Cursor was able to rapidly and reliably scale their infrastructure 100x using Datadog to meet the needs of a fast-growing user base. And learn more about how we’re bringing Datadog tools and context to your favorite AI IDEs and agents with our MCP Server and extensions.

View Video

Datadog

Read more about How Cursor scaled infrastructure rapidly and reliably using Datadog

How to fix high CPU temperature: A network admin's checklist

Jun 24, 2025 By monicaa.mn@zohocorp.com In ManageEngine

It’s 2 AM. Your phone buzzes. A critical server’s CPU is maxing out again. But this time, the issue isn’t just high usage. It’s heat. As a network admin, you’re trained to monitor traffic patterns, patch vulnerabilities, and respond to performance slowdowns. But high CPU temperature? That’s the silent system killer many still underestimate. Without a proactive plan, it can knock out performance, rack up hardware costs, and shorten the lifespan of your infrastructure.

Read Post

ManageEngine

Read more about How to fix high CPU temperature: A network admin's checklist

Data Center Ops with InfluxDB 3: From Raw Metrics to Actionable Insights with Ease

Jun 24, 2025 By Suyash Joshi In InfluxData

Modern data centers generate enormous volumes of telemetry from servers, switches, cooling systems, power infrastructure, and environmental sensors. Operations engineers must capture, store, and analyze this data in real-time to monitor uptime, maintain energy efficiency, and perform predictive maintenance using AI. Legacy monitoring systems struggle to meet today’s volume, cardinality, and latency demands.

Read Post

InfluxData

Read more about Data Center Ops with InfluxDB 3: From Raw Metrics to Actionable Insights with Ease

AI Test Generation and PR Review in Sentry (Now in Open Beta)

Jun 24, 2025 By Lindsay Piper In Sentry

You write code. Open a PR. CI runs. PR merges. Prod’s on fire by 5pm. Maybe you skipped writing some tests. (It's tedious, sometimes unclear, and easy to ignore when you're racing to ship—until something breaks and you realize a test could’ve saved your Friday night.) Maybe the PR review was more of a drive-by from a teammate who barely had time to skim the diff. But reviews and tests matter.

Read Post

Sentry

Read more about AI Test Generation and PR Review in Sentry (Now in Open Beta)

Beyond Observability: Why Network Intelligence Will Make Traditional Network Management Obsolete

Jun 24, 2025 By Avi Freedman In Kentik

Kentik CEO and co-founder Avi Freedman explains why observability is not enough in the age of AI.

Read Post

Kentik

Read more about Beyond Observability: Why Network Intelligence Will Make Traditional Network Management Obsolete

How to connect your AWS account with Site24x7 using IAM role | Step-by-step tutorial

Jun 24, 2025 By ManageEngine Site24x7 In Site24x7

In this step-by-step tutorial, learn how to securely connect your AWS account with Site24x7 using IAM role-based cross-account access. We’ll guide you through: This method ensures secure, read-only access to your cloud environment while enabling real-time monitoring and alerting via Site24x7.

View Video

Site24x7

Read more about How to connect your AWS account with Site24x7 using IAM role | Step-by-step tutorial

How To Perform A TCP Check | Grafana Synthetic Monitoring

Jun 24, 2025 By Grafana In Grafana

Learn how to set up TCP checks using Grafana Cloud Synthetic Monitoring. In this video, we walk through how to create a TCP check and analyze test results.
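
For readers unfamiliar with what a TCP check measures, the following is a minimal Python sketch using only the standard library. It is not how Grafana Cloud Synthetic Monitoring implements its checks; the function name `tcp_check` and the shape of the result dictionary are assumptions made for illustration.

```python
import socket
import time


def tcp_check(host, port, timeout=3.0):
    """Try to open a TCP connection and report success and elapsed seconds."""
    start = time.monotonic()
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            pass
    except OSError as exc:  # covers refused connections, timeouts, DNS failures
        return {"ok": False, "seconds": time.monotonic() - start, "error": str(exc)}
    return {"ok": True, "seconds": time.monotonic() - start, "error": None}
```

A check like this only confirms that the port accepts connections within the timeout. It says nothing about whether the service behind the port responds correctly.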

View Video

Grafana

Read more about How To Perform A TCP Check | Grafana Synthetic Monitoring

FIPS 140-3 Compatible Builds for VictoriaMetrics Enterprise Components

Jun 24, 2025 By Artem Navoiev In VictoriaMetrics

VictoriaMetrics introduces FIPS 140-3 compatible builds for its components, starting with version 1.117.0. These builds utilize Google’s FIPS 140-3 validated BoringCrypto module. This is critical for customers in regulated environments (federal government, finance, healthcare) to meet FIPS 140-3 cryptographic requirements for data encryption, TLS, and secure communications.

Read Post

VictoriaMetrics

Read more about FIPS 140-3 Compatible Builds for VictoriaMetrics Enterprise Components

Observability 2.0: Seeing More, Knowing More, Fixing More

Jun 24, 2025 By ScienceLogic In ScienceLogic

The era of scattered monitoring tools and fragmented operational visibility is over. As hybrid and multi-cloud environments have become the norm rather than the exception, traditional observability approaches—siloed metrics, isolated logs, and disconnected traces—can no longer keep pace with the complexity of modern IT infrastructure. Organizations today need more than just monitoring.

Read Post

ScienceLogic

Read more about Observability 2.0: Seeing More, Knowing More, Fixing More

Naming your kernel objects

Jun 24, 2025 By Percepio In Percepio

When using Percepio TraceRecorder, kernel objects like queues, semaphores and mutexes are named using their address by default. This can be a bit hard to follow for complex traces. However, it is quite easy to set more descriptive custom names for your RTOS kernel objects. You do this by calling the “SetName” functions (or macros) found in the TraceRecorder API, for example: The first argument is the pointer to the object (i.e. the object address).

Read Post

Percepio

Read more about Naming your kernel objects

Verify Redirect URL in Uptime Checks

Jun 24, 2025 By Sean White In Oh Dear

We’ve just rolled out a helpful new setting to give you tighter control over how uptime is measured: Redirect URL Validation.

Read Post

Oh Dear

Read more about Verify Redirect URL in Uptime Checks

Are IT Certs Really Worth It? Here's the Truth

Jun 24, 2025 By solarwindsinc In SolarWinds

Thinking about getting IT or cybersecurity certifications? Here's why they actually matter — and how networking at places like DEF CON can lead to real career opportunities. Whether you're in desktop support or breaking into security, this one's for you.

View Video

SolarWinds

Read more about Are IT Certs Really Worth It? Here's the Truth

Configuring File Integrity Monitoring in SolarWinds Security Event Manager

Jun 24, 2025 By solarwindsinc In SolarWinds

Learn how to configure File Integrity Monitoring (FIM) capabilities in SolarWinds Security Event Manager (formerly Log & Event Manager).

View Video

SolarWinds

Read more about Configuring File Integrity Monitoring in SolarWinds Security Event Manager

SolarWinds Security Event Manager Overview

Jun 24, 2025 By solarwindsinc In SolarWinds

This overview video provides high-level awareness of the capabilities of SolarWinds Security Event Manager (formerly Log & Event Manager). Use the SIEM tool to detect threats, quickly respond to cyber incidents, and report compliance from a consolidated interface.

View Video

SolarWinds

Read more about SolarWinds Security Event Manager Overview

Interacting With Log Data in Security Event Manager

Jun 24, 2025 By solarwindsinc In SolarWinds

SolarWinds Security Event Manager is designed to give users a centralized view of logs and events occurring across their network, and quickly and easily recall specific logs and identify suspicious patterns and behaviors in that data. This video gives a quick overview of the features in SEM, making it easy for users to view and interact with their log data.

View Video

SolarWinds

Read more about Interacting With Log Data in Security Event Manager

Creating Correlation Rules with Security Event Manager

Jun 24, 2025 By solarwindsinc In SolarWinds

Learn how to create correlation rules with Security Event Manager (formerly Log & Event Manager).

View Video

SolarWinds

Read more about Creating Correlation Rules with Security Event Manager

A Guide to Effective Network Load Testing & Load Balancing

Jun 24, 2025 By Alyssa Lamberti In Obkio

When it comes to network management, two challenges are ever-present: ensuring optimal network performance and maintaining uninterrupted network connectivity. Network admins are the unsung heroes, diligently managing the digital highways that connect the modern world. To maintain the delicate balance between seamless user experience and network reliability, two crucial practices come to the forefront: Network Load Testing and Load Balancing.

Read Post

Obkio

Read more about A Guide to Effective Network Load Testing & Load Balancing

Stay Compliant: Meet Your Audit Needs with Datadog!

Jun 24, 2025 By Datadog In Datadog

Datadog's internal compliance team has built audit workflows and control monitoring capabilities using the Datadog platform. We actively use these capabilities to scale our audit programs and comply with multiple compliance frameworks. This session will go into the details of how we addressed our compliance use-cases using the Datadog platform and how our customers can get started.

View Video

Datadog

Read more about Stay Compliant: Meet Your Audit Needs with Datadog!

Introducing ZTB - Defining Zero Trust for Bring Your Own Cloud (BYOC)

Jun 24, 2025 By Pavankalyan Chiluka In SigNoz

Isn’t the "Bring Your Own Cloud" (BYOC) model the latest hot topic in the evolution of cloud-native architecture, especially for companies offering cloud-hosted platforms that must be deployed in the customer’s cloud for privacy, control, or compliance reasons? Over the past few weeks, we have been rigorously researching and discussing how to build a secure BYOC model.

Read Post

SigNoz

Read more about Introducing ZTB - Defining Zero Trust for Bring Your Own Cloud (BYOC)

Zero-effort alert migration from Prometheus to Coralogix

Jun 24, 2025 By Martin McLarnon, Dev Relations In Coralogix

Having spent two decades in technical leadership, I’ve seen firsthand what separates great development teams from merely good ones. It’s not about the number of features shipped or the elegance of the codebase; it’s about the ability to consistently deliver value to the customer through a really great user experience.

Read Post

Coralogix

Read more about Zero-effort alert migration from Prometheus to Coralogix

Escalating risk, shrinking margins: The 2025 Internet Resilience Report

Jun 24, 2025 By Leo Vasiliou In Catchpoint

When we first launched Catchpoint’s Internet Resilience Report back in 2024, we were already seeing troubling cracks in the digital foundations of major businesses. Remember the CrowdStrike outage? Fast-forward to this year, and it's clear the stakes have only gotten higher. Google Cloud’s recent outage is yet another reminder of how tightly interwoven the Internet is and how all it takes is for one major player to go down for thousands of businesses to be affected worldwide.

Read Post

Catchpoint

Read more about Escalating risk, shrinking margins: The 2025 Internet Resilience Report

OpenTelemetry vs Fluent Bit - Key Differences 2025

Jun 24, 2025 By Pavithra Parthiban In Atatus

Modern applications demand strong observability to ensure performance, reliability, and quick troubleshooting. Two powerful open-source tools, OpenTelemetry and Fluent Bit, play key roles in this space. While OpenTelemetry offers a full-stack framework for collecting metrics, logs, and traces, Fluent Bit specializes in fast, lightweight log forwarding.

Read Post

Atatus

Read more about OpenTelemetry vs Fluent Bit - Key Differences 2025

Coralogix adds OTel-based service dependency tracking for distributed systems

Jun 24, 2025 By Chris Cooney In Coralogix

Coralogix has released its APM Dependencies feature. This feature automatically surfaces and maps the relationships within and between your software and external services. It allows fine-grained tracking of which endpoints within your APIs depend on other endpoints, external services, or database tables.

Read Post

Coralogix

Read more about Coralogix adds OTel-based service dependency tracking for distributed systems

Top 14 Best Infrastructure Monitoring Tools & Solutions in 2025. Full Reviews and Side by Side Comparison

Jun 24, 2025 By Ehab Qadah In Sematext

As your business grows, so will your infrastructure and the number of applications or services running in it. In other words, forget about any sort of manual monitoring or home-grown scripts or tools if you want to keep your sanity. Whether you need performance metrics, service health and availability status, infrastructure, or application logs, you need a tool that will give you end-to-end visibility into the health of your infrastructure.

Read Post

Sematext

Read more about Top 14 Best Infrastructure Monitoring Tools & Solutions in 2025. Full Reviews and Side by Side Comparison

We did it! SquaredUp is now a B Corp

Jun 24, 2025 By Squared Up In Squared Up

After four years. Hundreds of meetings and conversations. Countless forms and paperwork submissions… We’ve done it. SquaredUp is finally, officially B Corp certified. We can now say that we are part of a unique global movement – and we couldn’t be more proud, excited, and motivated for our journey ahead.

Read Post

Squared Up

Read more about We did it! SquaredUp is now a B Corp

Mid-Year Update 2025

Jun 24, 2025 By Lance Erickson In Scout

It’s been a while since we’ve shared what’s been bubbling up behind the scenes at Scout. In this post, here’s an update on where we’ve been and where we’re going!

Read Post

Scout

Read more about Mid-Year Update 2025

Maintenance Window Improvements

Jun 24, 2025 By Leo Baecker In Hyperping

We've made major improvements to maintenance window notifications with flexible options that adapt to your communication strategy. Now you have three notification options for every maintenance window: You'll also see how many subscribers will be notified with a detailed breakdown of subscriber counts by channel type (Email, Slack, Teams, etc.), giving you complete visibility into your communication reach before sending.

Read Post

Hyperping

Read more about Maintenance Window Improvements

The Benefits of Using Juniper's Network Monitoring Tools for IT Operations

Jun 24, 2025 By OpsMatters In OpsMatters

More data means more complexities in IT networks. Hence, the right solution is needed to monitor such networks. Many companies struggle without the right tools, and they often lose great business opportunities because they are unable to identify performance-related issues upfront. Network monitoring is thus essential for business success. It helps build healthy network performance, saving companies money in the long run.

Read Post

OpsMatters

Read more about The Benefits of Using Juniper's Network Monitoring Tools for IT Operations

OpManager earns triple recognition in 2025

Jun 23, 2025 By Allan In ManageEngine

We’re pleased to share that ManageEngine OpManager has earned recognition across three critical areas of IT operations, achieving triple crown status in IT infrastructure management. OpManager has been featured in GetApp’s Category Leaders, Software Advice’s Front Runners, and Capterra’s Shortlist, in addition to being named in the Gartner Market Guide for IT Infrastructure Monitoring (ITIM).

Read Post

ManageEngine

Read more about OpManager earns triple recognition in 2025

The Visibility vs Cost Trap: A Dangerous Tradeoff

Jun 23, 2025 By The Graylog Team In Graylog

“You can’t investigate what you don’t have”. Every analyst knows the pain of missing context. You’re in the middle of a high-stakes investigation, but the logs you need are gone, archived weeks ago due to retention limits. Or worse, they were never collected in the first place to keep costs under control. This is the Visibility vs. Cost trap, and it puts analysts at a disadvantage every day.

Read Post

Graylog

Read more about The Visibility vs Cost Trap: A Dangerous Tradeoff

Harnessing Network Observability to Speed the Telco-to-Techco Transition

Jun 23, 2025 By Arnold Hoogerwerf In Broadcom

For telecommunications firms (telcos), the race is on. If these organizations are to rise to meet their top challenges and growth objectives, transformation is a must. Those who make this move most rapidly will be best positioned for sustained success. Today, telcos face several significant challenges, which are creating fundamental disruption: Telcos need to transform to contend with these shifts.

Read Post

Broadcom

Read more about Harnessing Network Observability to Speed the Telco-to-Techco Transition

Getting started with Cloudflare dashboards

Jun 23, 2025 By Sameer Mhaisekar In Squared Up

Cloudflare is a widely adopted web performance and security platform, best known for its CDN, DDoS protection, and DNS services. While it provides rich telemetry and real-time analytics, the sheer volume and complexity of the data can make it hard to identify key trends or issues at a glance. This is where a solution like SquaredUp (or another dashboarding tool) comes in.

Read Post

Squared Up

Read more about Getting started with Cloudflare dashboards

Alert Fatigue can Destroy Your SLA #Checkly #playwright #DevOps

Jun 23, 2025 By Checkly In Checkly

Optimizing Reliability: https://www.checklyhq.com/docs/monitoring/optimizing-reliability/

Read about the Checkly Import Tool: https://www.checklyhq.com/docs/cli/command-line-reference/#npx-checkly-import

View Video

Checkly

Read more about Alert Fatigue can Destroy Your SLA #Checkly #playwright #DevOps

Generating Playwright Tests With AI: Let's Try the New Playwright MCP Server!

Jun 23, 2025 By Checkly In Checkly

In this video, Stefan (Playwright Ambassador) dives into the integration of AI with the Playwright MCP server to automate end-to-end test generation. Learn about MCP, browser automation and how to combine everything to generate Playwright tests. We'll explore AI capabilities and limits and discuss best practices for generating accurate and reliable Playwright tests. If you're curious about leveraging AI for end-to-end testing with Playwright, this video is for you!

View Video

Checkly

Read more about Generating Playwright Tests With AI: Let's Try the New Playwright MCP Server!

AI-Augmented Control Plane: Scaling IT Operations with Intelligent Automation

Jun 23, 2025 By Datadog In Datadog

How do you enable a team of 100 engineers to effectively support 300+ critical applications across five hosting platforms? At Thomson Reuters, we turned to AI - not as a buzzword, but as a genuine force multiplier. Experience our journey of transforming traditional IT operations into an AI-augmented powerhouse, where Datadog, ServiceNow, and custom AI solutions work in harmony to create a next-generation control plane. We'll share real victories, honest challenges, and practical insights from our mission to build a more intelligent operational framework.

View Video

Datadog

Read more about AI-Augmented Control Plane: Scaling IT Operations with Intelligent Automation

Real User Monitoring (RUM) vs. Synthetic Monitoring: Understanding Best Practices

Jun 23, 2025 By Mohana Ayeswariya J In Atatus

For modern engineering and DevOps teams, user experience isn’t a post-deployment concern, it’s a critical operational metric. Monitoring how real users interact with your application is no longer optional, especially in high-traffic, dynamic, or global environments. This is where real user monitoring (RUM) proves indispensable. But RUM isn’t the only approach.

Read Post

Atatus

Read more about Real User Monitoring (RUM) vs. Synthetic Monitoring: Understanding Best Practices

Custom timeframes are here!

Jun 23, 2025 By Noorul Huda N In Squared Up

In the realm of data and observability, timing is everything. Until now, SquaredUp provided fixed time options like the last hour, 12 hours, 24 hours, last week, and this month. While these options served many users well, we recognized that they lacked the flexibility you needed. Whether you're tracking long-term performance, comparing trends, or looking into specific events, we know these preset options could sometimes feel limiting.

Read Post

Squared Up

Read more about Custom timeframes are here!

Farewell, Cherwell: Celebrations, Migrations, and Considerations

Jun 23, 2025 By solarwindsinc In SolarWinds

As Cherwell approaches end of life, IT service management (ITSM) teams are facing major decisions about what’s next. In this episode, SolarWinds host Sean Sebring sits down with Matt Neigh, a former Cherwell executive, and Michael Clark, a SolarWinds Solutions Engineer and former Cherwell admin, to reflect on Cherwell’s legacy, explore migration best practices, and discuss what to look for in a modern ITSM platform.

View Video

SolarWinds

Read more about Farewell, Cherwell: Celebrations, Migrations, and Considerations

How To Perform A DNS Check | Grafana Synthetic Monitoring

Jun 23, 2025 By Grafana In Grafana

Learn how to set up DNS checks using Grafana Cloud Synthetic Monitoring. In this video, we walk through how to create a DNS check and analyze test results.

View Video

Grafana

Read more about How To Perform A DNS Check | Grafana Synthetic Monitoring

Structured Logging in NextJS with OpenTelemetry

Jun 23, 2025 By Yuvraj Singh Jadon In SigNoz

Traces tell you what happened and when. Logs tell you why. When something breaks, logs are often your first clue—and if they’re correlated with traces, they can cut debugging time down from hours to minutes. In this section, we’ll wire up end-to-end structured logging across both server and browser environments in your Next.js app, complete with trace correlation and SigNoz integration.

Read Post

SigNoz

Read more about Structured Logging in NextJS with OpenTelemetry

Advanced Threshold Configurations in Site24x7

Jun 23, 2025 By ManageEngine Site24x7 In Site24x7

Are constant, trivial alerts overwhelming your IT and DevOps teams? In this video, learn how Site24x7's Advanced Thresholds provide smarter alerting by understanding meaningful patterns and anomalies, improving focus and response to real issues. We'll walk you through: Whether you're a system admin, network engineer, or IT manager, this feature helps you streamline alert management.

View Video

Site24x7

Monitoring

Read more about Advanced Threshold Configurations in Site24x7

Brand-Driven Observability: Crafting Monitoring That Reflects Your Product Identity

Jun 21, 2025 By OpsMatters In OpsMatters

In the fast-paced world of modern IT operations, observability has become a crucial pillar in ensuring the health, reliability, and performance of complex systems. As organizations scale their infrastructures and embrace distributed architectures, monitoring systems have evolved beyond simple uptime checks to holistic observability platforms. However, in this technical landscape, one often overlooked element is the role of branding in observability design.

Read Post

OpsMatters

Read more about Brand-Driven Observability: Crafting Monitoring That Reflects Your Product Identity

Sponsored Post

MariaDB Monitoring for Enhancing Performance, Availability, and Security

Jun 20, 2025 By NiCE IT Mgmt In NiCE IT Mgmt

As organizations increasingly rely on MariaDB for their critical applications, ensuring optimal database performance, availability, and security becomes essential. This whitepaper provides a strategic guide to mastering MariaDB monitoring, helping IT teams proactively detect and resolve issues before they impact business operations.

Read Post

NiCE IT Mgmt

Read more about MariaDB Monitoring for Enhancing Performance, Availability, and Security

The one where we reminisce about RSAC

Jun 20, 2025 By Cribl In Cribl

RSAC is always a good pulse check on where the security world is headed — and this spring was no different. Join in as we reminisce and talk about the trends that are shaping IT and Security.

View Video

Cribl

Read more about The one where we reminisce about RSAC

Prometheus Logging Explained for Developers

Jun 20, 2025 By Prathamesh Sonpatki In Last9

Running apps in production? You need visibility fast. Traditional logging gives you scattered events. Prometheus gives you structured, queryable data that scales. In this guide, we’ll break down how to use Prometheus for logging-style observability, where it fits in your stack, and how to plug it into tools like Grafana or your cloud-native setup.

Read Post

Last9

Read more about Prometheus Logging Explained for Developers

How the Factry Historian data source for Grafana enables data-driven insights for factory teams

Jun 20, 2025 By Frederik Van Leeckwyck In Grafana

Frederik Van Leeckwyck is the co-founder and CRO at Factry. He oversees go-to-market activities and ensures their software solutions align with real factory demands. Passionate about open technologies, he believes in making data-driven insights accessible to everyone in the factory. Factories today are often rich in process data, but poor in insights.

Read Post

Grafana

Read more about How the Factry Historian data source for Grafana enables data-driven insights for factory teams

Fluentd vs Fluent Bit: A Side-by-Side Comparison 2025

Jun 20, 2025 By Aiswarya S In Atatus

Fluentd and Fluent Bit are both open-source data collection and processing tools, but they serve different purposes. Fluentd offers a comprehensive, plugin-rich architecture ideal for centralized log aggregation. Fluent Bit is designed for performance and efficiency, making it a better fit for edge devices and environments with limited resources. This Fluentd vs Fluent Bit comparison outlines their key differences, helping you decide which fits your infrastructure best.

Read Post

Atatus

Read more about Fluentd vs Fluent Bit: A Side-by-Side Comparison 2025

Highlight reel: Futureproof Your AI Investment With Observability

Jun 20, 2025 By Honeycomb In Honeycomb

Artificial intelligence is changing the way modern systems are built—and how teams are expected to and operate them. But as AI-driven complexity grows, so too does the need for deep, reliable, and fast visibility into what’s really happening inside our. In this timely and thought-provoking session, Christine Yen, CEO and Co-founder of Honeycomb, explores how practices must evolve to keep pace with.

View Video

Honeycomb

Read more about Highlight reel: Futureproof Your AI Investment With Observability

Monitor SSL certificates in StatusGator with TrackSSL

Jun 20, 2025 By Colin Bartlett In StatusGator

We’re excited to announce a new integration with our sister product, TrackSSL — making SSL certificate monitoring simpler and more powerful than ever, right within StatusGator.

Read Post

StatusGator

Read more about Monitor SSL certificates in StatusGator with TrackSSL

In Case You Missed it: DX NetOps Active Experience Launched

Jun 19, 2025 By Jason Normandin In Broadcom

There’s no doubt that managing networks today is a whole different ballgame than it used to be. Complexity is growing, environments are more fragmented, and user expectations have never been higher. One of the biggest challenges for network operations teams? Visibility—or the lack of it. Network operations used to be much simpler. Traffic flowed through your own data center, and you had the visibility and control needed to manage performance and troubleshoot issues.

Read Post

Broadcom

Read more about In Case You Missed it: DX NetOps Active Experience Launched

VictoriaLogs Unleashed: Cluster Version Now Available for Exceptional, Linear Scaling

Jun 19, 2025 By Jean-Jerome Schmidt-Soisson In VictoriaMetrics

You asked, and we listened! We’re thrilled to announce the release of the VictoriaLogs Cluster version – one of the most requested and anticipated updates from our user community. This marks a significant leap forward for VictoriaLogs, empowering users to handle log volumes and ingestion rates far beyond the limits of a single node.

Read Post

VictoriaMetrics

Read more about VictoriaLogs Unleashed: Cluster Version Now Available for Exceptional, Linear Scaling

What is Internet Jitter & How to Test It

Jun 19, 2025 By Alyssa Lamberti In Obkio

If you’ve ever had a user complain that their video call was choppy or their VoIP call had weird delays, even though the Internet speed looked fine, you’ve probably run into Internet jitter. It’s one of those issues that doesn’t always show up on a speed test, but it can absolutely wreck real-time communication. And if you’re managing networks across remote offices, home setups, or hybrid work environments, you’ll want to keep an eye on it.

Read Post

Obkio

Read more about What is Internet Jitter & How to Test It

The Cost of Bad Data: Why Time Series Integrity Matters More Than You Think

Jun 19, 2025 By Allyson Boate In InfluxData

Data plays a critical role in shaping operational decisions. From sensor streams in factories to API response times in cloud environments, organizations rely on time-stamped metrics to understand what’s happening and determine what to do next. But when that data is inaccurate or incomplete, systems make the wrong call. Teams waste time chasing false alerts, miss critical anomalies, and make high-stakes decisions based on flawed assumptions.

Read Post

InfluxData

Read more about The Cost of Bad Data: Why Time Series Integrity Matters More Than You Think

Blueprints Are Pre-Built Processor Bundles #opentelemetry #collector #observability

Jun 19, 2025 By Bindplane In ObservIQ

Check out the full @bindplane community call in June. Here we explore automated JSON parsing using a JSON Parse processor bundle that was added with the Blueprint feature in Bindplane. Learn how to parse JSON strings, extract fields, and set accurate timestamps without having to add any custom configs. Bindplane handles all the heavy lifting automatically.

View Video

ObservIQ

Read more about Blueprints Are Pre-Built Processor Bundles #opentelemetry #collector #observability

Regex Log Parsing Made Easy with AI/LLM Support #opentelemetry #collector #observability

Jun 19, 2025 By Bindplane In ObservIQ

Check out the full @bindplane community call in June. We explore Apache HTTP source and the new AI regex log parsing capabilities. We leverage a Bindplane processor for complex pattern matching, enabling efficient data processing. This guide demonstrates how to easily generate and apply regex patterns with AI support.

View Video

ObservIQ

Read more about Regex Log Parsing Made Easy with AI/LLM Support #opentelemetry #collector #observability

Bindplane Recommendation Engine: Automatically Improve Telemetry Parsing #opentelemetry #collector

Jun 19, 2025 By Bindplane In ObservIQ

Check out the full @bindplane community call in June. See how Bindplane instantly suggests improvements using its recommendation engine. This video explores how to automatically parse severity with default values, enhancing data analysis efficiency. Learn how to quickly optimize your setup.

View Video

ObservIQ

Read more about Bindplane Recommendation Engine: Automatically Improve Telemetry Parsing #opentelemetry #collector

Docker Stop vs Kill: When to Use Each Command

Jun 19, 2025 By Anjali Udasi In Last9

When a container starts consuming excessive memory or becomes unresponsive, you need a way to shut it down. The two primary options, docker stop and docker kill, both terminate containers, but they operate differently and have different implications. The key difference: docker stop sends SIGTERM for a graceful shutdown, then escalates to SIGKILL if the process doesn’t exit in time. docker kill skips straight to SIGKILL, terminating the container immediately.
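
The SIGTERM-then-SIGKILL escalation described above can be sketched outside Docker with plain processes. This is only an illustration of the signal pattern on a Unix-like system, not Docker's implementation: the helper names (`start`, `stop_like`, `kill_like`) and the two child scripts are hypothetical, and the 10-second grace period is assumed to mirror Docker's documented default for `docker stop`.

```python
import signal
import subprocess
import sys

# Hypothetical child that handles SIGTERM and exits cleanly (a "well-behaved" app).
GRACEFUL_CHILD = (
    "import signal, sys, time\n"
    "signal.signal(signal.SIGTERM, lambda *_: sys.exit(0))\n"
    "print('ready', flush=True)\n"
    "time.sleep(60)\n"
)

# Hypothetical child that ignores SIGTERM (an "unresponsive" app).
STUBBORN_CHILD = (
    "import signal, time\n"
    "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
    "print('ready', flush=True)\n"
    "while True:\n"
    "    time.sleep(1)\n"
)


def start(source):
    """Start a child Python process and wait until its signal handlers are installed."""
    proc = subprocess.Popen([sys.executable, "-c", source], stdout=subprocess.PIPE)
    proc.stdout.readline()  # the child prints 'ready' after setting up its handlers
    return proc


def stop_like(proc, grace_seconds=10.0):
    """Send SIGTERM, wait for the grace period, then escalate to SIGKILL."""
    proc.send_signal(signal.SIGTERM)
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL cannot be caught or ignored
        return proc.wait()


def kill_like(proc):
    """Skip the grace period and send SIGKILL immediately."""
    proc.kill()
    return proc.wait()


if __name__ == "__main__":
    print("graceful stop:", stop_like(start(GRACEFUL_CHILD), grace_seconds=5))
    print("stubborn stop:", stop_like(start(STUBBORN_CHILD), grace_seconds=1))
    print("immediate kill:", kill_like(start(GRACEFUL_CHILD)))
```

A negative return code from `Popen.wait()` means the child was terminated by that signal number, so a forced shutdown shows up as the negative of SIGKILL, while a process that handled SIGTERM itself reports its own exit code.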

Read Post

Last9

Read more about Docker Stop vs Kill: When to Use Each Command

Using Splunk to Monitor the Security of MCP Servers

Jun 19, 2025 By Rod Soto In Splunk

In this blog, we look at how to use Splunk to monitor the security of MCP servers, a new technology developed by Anthropic that bridges local applications with large language models (LLMs).

Read Post

Splunk

Read more about Using Splunk to Monitor the Security of MCP Servers

From the source to the edge: the six agent types you can't ignore

Jun 19, 2025 By Gael Hernandez In Catchpoint

Recently, Catchpoint expanded our Global Agent Network to over 3,000 agents. In a crowded space, this is by far one of our key differentiators. At the time of writing, no one else boasts 395 providers in 105 countries and 346 cities. As Director of ISP Strategy, I’m not here to pat myself on the back—my real question is: why?

Read Post

Catchpoint

Read more about From the source to the edge: the six agent types you can't ignore

Building and Using a Custom #OpenTelemetry #Collector with #Bindplane

Jun 19, 2025 By Bindplane In ObservIQ

Check out the full @bindplane community call in June. We explore building custom OpenTelemetry collectors with the OpenTelemetry Distribution Builder and using Bindplane's new Bring Your Own Collector feature. We showcase source and destination compatibility within Bindplane and how BYOC does not let you misconfigure a custom-built collector.

View Video

ObservIQ

Read more about Building and Using a Custom #OpenTelemetry #Collector with #Bindplane

Catchpoint News Catchup Episode 2

Jun 19, 2025 By Catchpoint In Catchpoint

Join Sergey, Brandon, Payal, and Leon to talk about recent news from Wired, HackerNews, Mashable, and Figma. Links: Reach out to our hosts! Visit catchpoint.com for more information.

View Video

Catchpoint

Monitoring

Read more about Catchpoint News Catchup Episode 2

Top tips: Fly high with AI-benefits of artificial intelligence in aviation

Jun 19, 2025 By David Simon In ManageEngine

Top tips is a weekly column where we highlight what’s trending in the tech world today and list ways to explore these trends. This week, we’ll look at two ways AI is optimizing flying for the passenger as well as the airline. "Brace! Brace! Brace!" Simple request. Serious consequences. Something that could get even the most vocal atheist to start praying. Something that no one would ever wish to hear in their lifetime.

Read Post

ManageEngine

Read more about Top tips: Fly high with AI-benefits of artificial intelligence in aviation

What is Real User Monitoring (RUM)?

Jun 19, 2025 By Janani In Atatus

As applications grow more complex and user expectations rise, delivering seamless and high-performing experiences to users is non-negotiable. Real User Monitoring (RUM) has emerged as an essential technique that gives developers, DevOps teams, and site reliability engineers deep visibility into the actual performance of web applications by capturing the experiences of real people in real time.

Read Post

Atatus

Read more about What is Real User Monitoring (RUM)?

Chaos Testing a PostgreSQL Cluster: How Kubernetes Can Restore Replica Failures (in 30 Seconds)

Jun 19, 2025 By Coroot In Coroot

🐧🐝 Use open source, automatic eBPF observability to quickly fix Patroni failures in your kubernetes cluster: https://t.ly/qBH9f

#devops #opensource #observability #kubernetes #postgresql

View Video

Coroot

Read more about Chaos Testing a PostgreSQL Cluster: How Kubernetes Can Restore Replica Failures (in 30 Seconds)

The Future of Internet Performance Monitoring | Spring Product Launch

Jun 19, 2025 By Catchpoint In Catchpoint

Watch our live product launch, hosted by Matt Izzo, Chief Product Officer; Howard Beader, VP of Product Marketing; and Leon Adato, Principal Technology Advocate.

View Video

Catchpoint

Monitoring

Read more about The Future of Internet Performance Monitoring | Spring Product Launch

Why Clarity Demands More Than Dashboards

Jun 18, 2025 By Raja Shekar Mulpuri In HEAL Software

Despite years of investment in observability stacks and AI dashboards, most IT organizations still struggle with one uncomfortable truth: they can’t identify root cause in real time, and they can’t explain how technical failures impact the business. Not in dollars. Not in user flows. Not in boardroom language. What’s worse, they often don’t realize what they’re missing.

Read Post

HEAL Software

Read more about Why Clarity Demands More Than Dashboards

WWDC 2025: What's new for enterprise device management

Jun 18, 2025 By Manish from ManageEngine In ManageEngine

Apple’s WWDC 2025 delivered a wave of exciting updates for anyone involved in managing company devices. With improvements designed to simplify provisioning, strengthen app controls, and expand what Apple Business Manager can do, these changes are all about making life easier for IT teams irrespective of the industry. In this article, we’ll break down the key announcements and explore how they could reshape the way you manage your organization’s Apple devices.

Read Post

ManageEngine

Read more about WWDC 2025: What's new for enterprise device management

Creating a Java monitoring strategy for high-availability systems

Jun 18, 2025 By Angeline Solomon In ManageEngine

High-availability (HA) systems form the backbone of modern enterprise applications. In today's always-on world, Java applications are expected to deliver consistent performance with minimal downtime. However, achieving this critical objective is impossible without a well-defined and executed monitoring strategy. A robust Java monitoring approach is essential to ensure resilience, uptime, and peak performance.

Read Post

ManageEngine

Read more about Creating a Java monitoring strategy for high-availability systems

Netdata Implements MCP Protocol

Jun 18, 2025 By Costa Tsaousis In netdata

We are excited to announce that Netdata has officially implemented the Model Context Protocol (MCP), joining the forefront of AI-powered infrastructure monitoring.

Read Post

netdata

Read more about Netdata Implements MCP Protocol

Securing AI with AI-SPM: The Next Step in AI Risk Management

Jun 18, 2025 By Teneo In Teneo

The conversations around artificial intelligence (AI) typically revolve around its vast potential: writing applications, automating tasks, or transforming entire industries. However, despite the excitement around AI’s potential, the more pressing issue for many organizations is how to manage the risks of deploying it at scale across the enterprise. This is where AI Security Posture Management (AI-SPM) comes into play.

Read Post

Teneo

Read more about Securing AI with AI-SPM: The Next Step in AI Risk Management

Getting Started with Traceroute

Jun 18, 2025 By Leon Adato In Catchpoint

“Traceroute? You mean the thing I can type at the command line? Why would I even want to set up a test for that?” This is, believe it or not, a comment we hear a lot at Catchpoint. At least from folks who are either new to tech, new to monitoring, or new to Catchpoint (or all three). It’s a common misconception. It’s also something I’m not going to spend a ton of time addressing here. This blog is not meant to convince you why traceroute is super useful (even though it is).

Read Post

Catchpoint

Read more about Getting Started with Traceroute

Access Logs: Format Specification and Practical Usage

Jun 18, 2025 By Anjali Udasi In Last9

Your server's been logging everything—it’s just easy to overlook until something breaks. Every incoming request, database call, or auth check ends up in your access logs. They’re not flashy, but they quietly document every interaction your system handles. For developers, they’re often the most reliable starting point when things go wrong. In this blog, we'll take a look at what an access log is, its format, types, and a few best practices.

Read Post

Last9

Read more about Access Logs: Format Specification and Practical Usage

Custom Webhooks: a Quick Example

Jun 18, 2025 By Honeycomb In Honeycomb

When an alert triggers in Honeycomb, how do you want to hear about it? Send detailed information however you want it with custom webhooks.

View Video

Honeycomb

Read more about Custom Webhooks: a Quick Example

A Demo Walkthrough of Seer, Sentry's AI Debugger

Jun 18, 2025 By Sentry In Sentry

Watch Cody walk us through how to use Seer and how quickly it can root cause issues to save you precious time.

View Video

Sentry

Read more about A Demo Walkthrough of Seer, Sentry's AI Debugger

Honeycomb Observability Day London: A Jam-Packed Day of Great Talks

Jun 18, 2025 By Ken Rimple In Honeycomb

On May 15th, 2025, Honeycomb hosted Observability Day (or O11yDay) in the London financial district. The skies were clear and the weather was wonderful and we had a huge turnout, from our networking breakfast to the happy hour at the end of the day.

Read Post

Honeycomb

Read more about Honeycomb Observability Day London: A Jam-Packed Day of Great Talks

A New Look At Dependencies: Icinga Dependency Views

Jun 18, 2025 By Johannes Meyer In Icinga

We’re excited to share that Icinga now offers an improved way to view dependencies. With the releases of Icinga DB Web 1.2.0, Icinga DB 1.4.0, and Icinga 2.15.0 today, any dependencies you’ve set up in Icinga will now be visually represented. Additionally, we’re introducing a new enterprise feature called Icinga Dependency Views, available through an Icinga subscription. This component expands Icinga DB Web with even more powerful capabilities.

Read Post

Icinga

Read more about A New Look At Dependencies: Icinga Dependency Views

Get more out of Sumo Logic: five log search hacks you'll actually use

Jun 18, 2025 By Carlos Solano In Sumo Logic

Think Sumo Logic is only for query language pros? Think again. Whether you’re deep into JSON logs or just trying to make sense of a Linux error message, these five time-saving hacks turn anyone into a log-searching ninja, no regex, no complexity, just clicks. From instantly parsing values to filtering down with a tap, these tips will help you troubleshoot faster, work smarter, and feel more confident in your observability game. You’ve got logs, now it’s time to put them to work.

Read Post

Sumo Logic

Read more about Get more out of Sumo Logic: five log search hacks you'll actually use

LLM Observability for Reliability and Stability: A Monitoring Strategy for Phone Communication

Jun 18, 2025 By Datadog In Datadog

LLM APIs offer groundbreaking potential, but also present challenges such as response latency, hallucinations, and service instability. In Japan, where telephone communication remains crucial for business, these issues present significant barriers to the introduction of LLM-based applications. Despite being a relatively young startup, we have developed and deployed an LLM-based telephone service with over 40 million calls.

View Video

Datadog

Read more about LLM Observability for Reliability and Stability: A Monitoring Strategy for Phone Communication

An open source tool to speed up iOS app launch

Jun 18, 2025 By Noah Martin In Sentry

What do the Snapchat, Airbnb, and Spotify iOS apps have in common? They all use order files to speed up their iOS app launch times. Order files re-order your binary to improve how symbols are loaded into memory. No code changes are necessary, but generating an optimized order file can be cumbersome, so it’s mostly done by larger teams or teams willing to pay for a service like Emerge Tools’ Launch Booster. It just so happens that Emerge Tools is now part of Sentry.

Read Post

Sentry

Read more about An open source tool to speed up iOS app launch

Log Management and Query Optimization in Kibana

Jun 18, 2025 By Faiz Shaikh In Last9

When troubleshooting with the Elastic Stack, Kibana is often the interface you’ll rely on to query and visualize logs. It doesn’t change the data—it just makes it searchable and a bit easier to work with under pressure. If you’re investigating an outage, tracking performance issues, or trying to correlate events across services, Kibana’s log exploration tools can speed up the process, assuming they’re configured and used well.

Read Post

Last9

Read more about Log Management and Query Optimization in Kibana

Mastering Global Telemetry: How Cribl Puts You in Control

Jun 18, 2025 By Alexandra Gates In Cribl

Let’s face it: managing global data infrastructure isn’t just hard, it’s “I-just-deployed-the-wrong-config-to-prod-again” hard. If you’re a Cribl Admin or Operator working across clouds, continents, and compliance regimes, your to-do list probably reads like a series of increasingly desperate Post-it notes. Sources. Destinations. Pipelines. TLS settings. Proxies. Dev, staging, prod. Repeat. Forever. But what if we told you there’s a better way?

Read Post

Cribl

Read more about Mastering Global Telemetry: How Cribl Puts You in Control

The one where we show you what's next for Cribl Copilot

Jun 18, 2025 By Cribl In Cribl

Join Ed Bailey and Sydnee Mayers as they discuss what's in store for the future of Cribl Copilot!

View Video

Cribl

Read more about The one where we show you what's next for Cribl Copilot

7 Critical Insider Threat Indicators and How to Detect Them

Jun 18, 2025 By Filip Cerny In Flowmon

Cybersecurity threats don’t come solely from external attackers. Insider threats also require your attention. Insider risk originates from employees, contractors or business partners who possess legitimate access to IT systems for their work tasks. They can access valuable data and systems that, if exposed or stolen, could harm an organization’s reputation.

Read Post

Flowmon

Read more about 7 Critical Insider Threat Indicators and How to Detect Them

Smarter billing insights, better cost control

Jun 18, 2025 By Cribl In Cribl

Cribl's billing and usage visibility has a brand new look! Check it out to track usage trends and optimize spending.

View Video

Cribl

Read more about Smarter billing insights, better cost control

The hype is over: Generative AI is driving the evolution of search within enterprises

Jun 17, 2025 By Jessica Taylor In Elastic

Discover how Accenture and Elastic are helping businesses seize the opportunities offered by generative AI. When it comes to generative AI, enterprises need to think big. Shaving a few seconds off the time needed to draft an email is helpful, but the journey to real value begins when you apply AI at the enterprise level. A new partnership between Accenture and Elastic combines technical expertise and strategic excellence, enabling businesses to build the data foundations for a successful AI future.

Read Post

Elastic

Read more about The hype is over: Generative AI is driving the evolution of search within enterprises

How to achieve Root Cause Analysis in 30 seconds: Chaos testing a PostgreSQL cluster

Jun 17, 2025 By Coroot In Coroot

🐧🐝 Use open source, automatic eBPF observability to quickly fix Patroni failures in your Kubernetes cluster: https://t.ly/qBH9f

View Video

Coroot

Read more about How to achieve Root Cause Analysis in 30 seconds: Chaos testing a PostgreSQL cluster

Network Latency: Types, Causes, and Fixes

Jun 17, 2025 By Anjali Udasi In Last9

Sometimes your API call takes a few seconds longer than expected. Or users start reporting slow page loads. One of the most common reasons? Network latency.

Read Post

Last9

Read more about Network Latency: Types, Causes, and Fixes

Configure and customize Kubernetes Monitoring easier with Alloy Operator

Jun 17, 2025 By Pete Wall In Grafana

What if you were to tell Kubernetes Monitoring what you wanted, and the system configured collectors based on your choices? We wondered that as well—wondered enough to create Alloy Operator and its Helm chart for version 3.0 of the Kubernetes Monitoring Helm chart. We’re excited to share that the new Kubernetes Monitoring Helm chart is now available, and it introduces a dynamic way of setting up your telemetry data collection with Alloy Operator.

Read Post

Grafana

Read more about Configure and customize Kubernetes Monitoring easier with Alloy Operator

Why Modern Incident Response Strategies Need Network and Service Intelligence: Part 2

Jun 17, 2025 By Connor Tye In Splunk

In Part 1, we explored how aligning network visibility with IT service context empowers faster, smarter incident response. But what does this actually look like? Here in Part 2, we’ll go deeper into the challenges of traditional monitoring approaches, and how teams should look to move from fragmented alerts to unified insights – because when ITOps and NetOps can both see the “what” & “why” of the problem, actions become instinct.

Read Post

Splunk

Read more about Why Modern Incident Response Strategies Need Network and Service Intelligence: Part 2

Guide for Catching Regressions with GitHub Actions and CI/CD Monitors

Jun 17, 2025 By Sematext In Sematext

This guide aims to help your team shift testing left, simulate real user behavior, and catch critical issues early, preventing regressions from reaching production by automating tests as part of your CI/CD pipeline and aborting deployments that contain issues. Synthetic monitoring is a great way to check important flows in production and make sure everything is working the way it’s supposed to.

Read Post

Sematext

Read more about Guide for Catching Regressions with GitHub Actions and CI/CD Monitors

Optimize Your Event Analysis: Reports, Dynamic Filters, and Log Parsing in Pandora FMS SIEM

Jun 17, 2025 By Pandora FMS team In Pandora FMS

The latest Pandora FMS version presents key improvements to the SIEM module, designed to enhance security event detection and management. These new features are available starting with Feature Release 782, allowing for optimized log analysis, report generation, and rule validation in distributed IT environments.

Read Post

Pandora FMS

Read more about Optimize Your Event Analysis: Reports, Dynamic Filters, and Log Parsing in Pandora FMS SIEM

Azure CDN for Static Assets, APIs, and Front Door

Jun 17, 2025 By Faiz Shaikh In Last9

If your users are spread across the globe but your servers are sitting in Virginia, you’ll probably hear complaints about slow load times, especially from places like Australia. CDNs fix this by caching static assets closer to where your users are. Azure CDN does exactly that, and it fits well if you're already using Azure services. You can hook it up to Blob Storage, App Services, or your origin. This guide covers how to set it up, what to expect, and how to know it’s working.

Read Post

Last9

Read more about Azure CDN for Static Assets, APIs, and Front Door

Seer, Sentry's AI Debugger, is Generally Available

Jun 17, 2025 By Tillman Elser In Sentry

Tired of trying to guess if that half-baked LLM suggestion is really going to fix the issue with your code? Meet Seer—our new AI agent that taps into all the issue context from Sentry and your codebase to not just guess, but root cause gnarly issues and propose merge-ready fixes specific to your application. Code gen tools are great fun—and useful. But even a recent Microsoft study confirmed what you already know: AI struggles with debugging.

Read Post

Sentry

Read more about Seer, Sentry's AI Debugger, is Generally Available

How Network Configuration Automation Improves Security and Efficiency

Jun 17, 2025 By ScienceLogic In ScienceLogic

Let’s face it: the modern enterprise network is a leviathan. No longer just a collection of routers and switches, today’s networks span multiple clouds, hundreds of SaaS applications, and countless IoT devices—supporting a workforce that could be anywhere.

Read Post

ScienceLogic

Read more about How Network Configuration Automation Improves Security and Efficiency

Change Management in Pandora ITSM with Full Traceability and Custom Workflows

Jun 17, 2025 By Pandora FMS team In Pandora FMS

With version 106 of Pandora ITSM, a critical feature has been introduced for technology environments operating under security frameworks, regulatory compliance, and efficient management: Change Management. This new module allows changes to be registered, approved, implemented, and closed in a structured way, with full traceability and responsibility control.

Read Post

Pandora FMS

Read more about Change Management in Pandora ITSM with Full Traceability and Custom Workflows

AutoCon3: Network Automation's Premier Conference

Jun 17, 2025 By Justin Ryburn In Kentik

AutoCon3 in Prague offered important takeaways on network automation’s evolution, from hands-on learning and design principles to the impact of AI and the power of community. Read Justin Ryburn’s recap to learn about key insights from the event, showing why network automation is now a core competency you’ll want to understand.

Read Post

Kentik

Read more about AutoCon3: Network Automation's Premier Conference

How InfluxDB 3 Enterprise Delivers 10-Millisecond Queries Over Historical Time Series Data

Jun 17, 2025 By Suyash Joshi In InfluxData

Time series data, such as IoT sensor readings or stock market ticks, flow in fast, often at a rate of millions of points per second. Querying this data, especially years of historical records, can be slow and painful if using a nonspecialized database rather than a time series database like InfluxDB.

Read Post

InfluxData

Read more about How InfluxDB 3 Enterprise Delivers 10-Millisecond Queries Over Historical Time Series Data

Say Hello to a Nicer (And More Readable) UI in #playwright 1.53!

Jun 17, 2025 By Checkly In Checkly

Join Stefan Judis, Playwright ambassador, as he shows some minor but invaluable updates to the Playwright UI mode, trace files and the HTML reporter.

View Video

Checkly

Read more about Say Hello to a Nicer (And More Readable) UI in #playwright 1.53!

Best Network Traffic Generator and Simulator Stress Test Tools

Jun 17, 2025 By Staff Contributor In SolarWinds

Benchmarking the environment of a new network is a crucial part of ensuring its success when it goes live. This includes stress testing and generating traffic on existing networks, both of which help you to identify any potentially flawed or vulnerable areas—for example, drops in connection and packet loss. As we know, network traffic is critical to the success of a business, as it determines how data flows and how effectively your applications interact.

Read Post

SolarWinds

Read more about Best Network Traffic Generator and Simulator Stress Test Tools

Webinar Snippet: Internet Troubleshooting with Obkio's "Sandwich Method"

Jun 17, 2025 By Obkio In Obkio

This is a snippet from our full webinar, “Troubleshooting Internet Issues: For Dummies & IT Pros.” In this clip, we dive into Obkio’s Sandwich Method, a simple yet powerful approach to monitoring and identifying Internet issues. By placing Monitoring Agents inside your LAN, at your firewall or in the DMZ, and over the Internet, you can break your network into clear segments and pinpoint exactly where performance problems are happening, whether it’s local, at the network edge, or in the hands of your ISP.

View Video

Obkio

Read more about Webinar Snippet: Internet Troubleshooting with Obkio's "Sandwich Method"

You have 3 seconds... that's it.

Jun 17, 2025 By Catchpoint In Catchpoint

You have 3 seconds... that’s it. Today, users lose patience fast. A 3-second delay in page load time leads to 40% of users abandoning your site. This leads to a damaged reputation, a decrease in customer trust, and a loss of revenue. What does that mean for you? Every millisecond counts. If you're not measuring your performance from your users' point of view, you might be missing a chance to convert them into customers.

View Video

Catchpoint

Monitoring

Read more about You have 3 seconds... that's it.

16 common mistakes C#/.NET developers make (and how to avoid them)

Jun 17, 2025 By Ali Hamza Ansari In elmah.io

As developers, we often fall into common pitfalls that impact the performance, security, and scalability of our applications. From neglecting data validation to overengineering, and from ignoring async/await to mishandling resource disposal, even experienced C# developers can make these mistakes. In this post, I've gathered some of the most frequent issues developers encounter in C# and how to avoid them with practical solutions.

Read Post

elmah.io

Read more about 16 common mistakes C#/.NET developers make (and how to avoid them)

The Visibility Gap: Bridging Flow and Metrics

Jun 16, 2025 By Eric Hian-Cheong In Kentik

The tools we use shape how we see problems. When flow and metrics are siloed, so is your visibility.

Read Post

Kentik

Read more about The Visibility Gap: Bridging Flow and Metrics

Introducing Netdata Insights

Jun 16, 2025 By Netdata In netdata

Now in research preview: Netdata Insights. The problem: Incident? You're jumping between dashboards, piecing together timelines. Reporting? You're copy-pasting charts and correlating trends by hand. The data’s there, but turning it into a narrative doesn’t scale. The solution: Netdata Insights, which synthesizes our high-fidelity telemetry using the latest LLMs into AI-powered reports with natural-language explanations, visuals, and clear recommendations.

View Video

netdata

Read more about Introducing Netdata Insights

Edwin AI Turns One: What a Year of Agentic AIOps Looks Like

Jun 16, 2025 By LogicMonitor In LogicMonitor

Twelve months ago, we shipped Edwin AI with a specific hypothesis that AI agents could handle the operational drudgery slowing down ITOps teams. It was a deliberate bet against the cautious consensus that AI should act only as a copilot, limited to offering suggestions. Most AIOps tools still follow that script. They’re stuck surfacing insights and stop short of action. Edwin was built differently. It was designed to make decisions, correlate events, and execute fixes.

Read Post

LogicMonitor

Read more about Edwin AI Turns One: What a Year of Agentic AIOps Looks Like

Monitor Your Kubernetes Cluster: Get Started in Four Minutes

Jun 16, 2025 By Anand Rajanala In Broadcom

For enterprises embracing Kubernetes, managing these intricate environments can pose significant challenges. Thankfully, monitoring of Kubernetes clusters is readily achievable using the Universal Monitoring Agent (UMA) in conjunction with DX Operational Observability (DX O2).

Read Post

Broadcom

Read more about Monitor Your Kubernetes Cluster: Get Started in Four Minutes

EventSentry Security Dashboards

Jun 16, 2025 By NETIKUS.NET LTD In EventSentry

Outlines how the 3 EventSentry dashboards can help improve the security of any Windows-based network.

View Video

EventSentry

Read more about EventSentry Security Dashboards

Tales From the Trench: Building With LLMs and Honeycomb

Jun 16, 2025 By Austin Parker In Honeycomb

AI discourse these days is all over the place. Depending on who you talk to, AIs are absolute flash-in-the-pan junk, or they’re the best thing since sliced bread. I want to cut through the noise, though, and see for myself what someone can do out here on the bleeding edge. Thus, I’m setting myself a challenge: write a usable, and useful, application with Claude Code, from soup to nuts. Here are the rules: With our ground rules established, let’s figure out our app!

Read Post

Honeycomb

Read more about Tales From the Trench: Building With LLMs and Honeycomb

Adaptive alerting: faster, better insights with the new metrics forecasting UI in Grafana Cloud

Jun 16, 2025 By Gerry Boland In Grafana

In Grafana Cloud, we offer a range of AI capabilities to support your observability needs, including a feature for forecasting on any of your metrics and coupling it with Grafana Alerting. This is critical functionality if you want to make the switch from reactive to proactive alerting, as troubleshooting a problem before it arises is an important part of modern observability.

Read Post

Grafana

Read more about Adaptive alerting: faster, better insights with the new metrics forecasting UI in Grafana Cloud

Kubernetes CPU Limit: How to Set and Optimize Usage

Jun 16, 2025 By Staff Member In SolarWinds

Kubernetes makes it easy to scale applications. But when it comes to CPU resource management, a poorly tuned cluster can quickly become unstable or inefficient. For network engineers, setting CPU requests and limits correctly—and understanding the deeper implications—is essential for keeping workloads efficient, costs predictable, and noisy neighbors in check.

Read Post

SolarWinds

Read more about Kubernetes CPU Limit: How to Set and Optimize Usage

Overview of Dashboard

Jun 16, 2025 By Uptime Website Monitoring In uptime

Get a complete overview of the Uptime.com Dashboard in this video! Learn how to monitor your checks, customize the layout, and analyze global uptime metrics and alerts. See how to view response times, sort check cards, manage alerts, adjust auto-refresh settings, and save multiple personalized dashboards. Whether you're tracking uptime, organizing by tags, or customizing alerts, this guide covers everything you need to make the most of your dashboard.

View Video

uptime

Read more about Overview of Dashboard

June 2025 Product Updates

Jun 16, 2025 By Leo Baecker In Hyperping

We're excited to share major product improvements this month, highlighted by the full launch of our on-call management system that many of you have been eagerly awaiting.

Read Post

Hyperping

Read more about June 2025 Product Updates

Introducing Sentry's Flutter SDK 9.0 - Logs, Session Replay, Feature Flags, and more

Jun 16, 2025 By Gino Buenaflor In Sentry

If you've ever had to debug a Flutter app after an error report that just says “Null check operator used on a null value,” you already know: context is everything. And context can be hard to come by when you’re juggling native code, Dart, async stack traces, and platform channels. With v9 of our Flutter SDK, we’re introducing some features to help you get even more visibility into what’s going wrong, with the insights to make it better. Here’s what’s new.

Read Post

Sentry

Read more about Introducing Sentry's Flutter SDK 9.0 - Logs, Session Replay, Feature Flags, and more

The role of network automation in AI-driven businesses

Jun 16, 2025 By Rebecca Grassing In Auvik

AI adoption is accelerating across nearly every industry. According to McKinsey’s 2025 State of AI report, 78% of organizations now use AI in at least one business function, up from just 55% the year prior. From real-time analytics to generative tools and process automation, AI is becoming a fundamental part of how modern businesses operate and compete.

Read Post

Auvik

Read more about The role of network automation in AI-driven businesses

How to achieve better observability and faster MTTR with the App Health Summary page

Jun 15, 2025 By Coroot In Coroot

🐧🐝 Try Coroot fully open source or Enterprise and check out the latest open source observability tips on our blog: https://t.ly/qBH9f

View Video

Coroot

Read more about How to achieve better observability and faster MTTR with the App Health Summary page

Serverless vs. Containers: A Comprehensive Guide to Choosing the Right Solution

Jun 15, 2025 By Staff Member In SolarWinds

In the rapidly evolving world of cloud computing, network engineers often need to decide between serverless computing and containerization. Both technologies offer unique advantages and are suited to different types of applications. This article aims to provide a comprehensive comparison of serverless computing and containers, helping network engineers make an informed decision based on their specific needs.

Read Post

SolarWinds

Read more about Serverless vs. Containers: A Comprehensive Guide to Choosing the Right Solution

Run #playwright tests from multiple countries with #checkly

Jun 15, 2025 By Checkly In Checkly

Global Locations and Scheduling Strategies with Checkly: https://www.checklyhq.com/docs/monitoring/global-locations/

View Video

Checkly

Read more about Run #playwright tests from multiple countries with #checkly

Defining SLA/SLO-Driven Monitoring Requirements in 2025

Jun 15, 2025 By Alexandr Bandurchin In Uptrace

SLA/SLO-driven monitoring aligns your observability strategy with business objectives by defining measurable service targets and implementing monitoring systems that track progress toward those goals. Service Level Agreements (SLAs) represent commitments to users, while Service Level Objectives (SLOs) are internal targets that ensure you meet those commitments with a safety buffer. In 2025, organizations running distributed systems need monitoring that goes beyond basic uptime checks.

Read Post

Uptrace

Read more about Defining SLA/SLO-Driven Monitoring Requirements in 2025

Cloud Cost Optimization Best Practices, Strategies, and Tools to Reduce Bills

Jun 14, 2025 By Staff Member In SolarWinds

As network engineers, you play a crucial role in managing cloud infrastructure that supports your organization’s applications and services. Cloud platforms offer immense flexibility and scalability, but without careful cost management, expenses can quickly spiral out of control.

Read Post

SolarWinds

Read more about Cloud Cost Optimization Best Practices, Strategies, and Tools to Reduce Bills

How to Use an SLA Uptime Calculator to Understand Service Availability

Jun 13, 2025 By Simon Rodgers In WebSitePulse

TL;DR: A Service Level Agreement (SLA) defines the required uptime for a service. An SLA uptime calculator helps convert uptime percentages into actual allowed downtime across different timeframes. This guide explains how these calculators work, why uptime matters, and how to monitor performance to meet SLA targets.

Read Post

WebSitePulse

Read more about How to Use an SLA Uptime Calculator to Understand Service Availability
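
The conversion an SLA uptime calculator performs, from an uptime percentage to allowed downtime, can be sketched in a few lines. This is a minimal illustration rather than the calculator the post describes: the function name `allowed_downtime` and the period lengths (a 30-day month, a 365-day year) are assumptions, and a real SLA defines its own measurement window.

```python
from datetime import timedelta

# Assumed period lengths for illustration only; real SLAs specify
# their own measurement windows (calendar month, rolling 30 days, ...).
PERIODS = {
    "day": timedelta(days=1),
    "week": timedelta(weeks=1),
    "month": timedelta(days=30),
    "year": timedelta(days=365),
}


def allowed_downtime(uptime_percent: float, period: timedelta) -> timedelta:
    """Return the downtime permitted in `period` at the given uptime target."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    downtime_fraction = (100 - uptime_percent) / 100
    return timedelta(seconds=period.total_seconds() * downtime_fraction)


if __name__ == "__main__":
    # Print the downtime budget for a 99.9% target over each period.
    for name, period in PERIODS.items():
        print(f"99.9% over one {name}: {allowed_downtime(99.9, period)}")
```

Under these assumptions, a 99.9% target over a 30-day month leaves a budget of roughly 43 minutes and 12 seconds of downtime, since 0.1% of 43,200 minutes is 43.2 minutes.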

OpenTelemetry for Go: measuring the overhead

Jun 13, 2025 By Nikolay Sivko In Coroot

Everything comes at a cost — and observability is no exception. When we add metrics, logging, or distributed tracing to our applications, it helps us understand what’s going on with performance and key UX metrics like success rate and latency. But what’s the cost? I’m not talking about the price of observability tools here, I mean the instrumentation overhead.

Read Post

Coroot

Read more about OpenTelemetry for Go: measuring the overhead

Data Lake Preview and Data Retrieval

Jun 13, 2025 By Graylog In Graylog

Follow Rich Murphy, Director of Product Management for Security, as he walks you through Data Lake Preview and Retrieval.

View Video

Graylog

Read more about Data Lake Preview and Data Retrieval

New TraceQL features in Grafana Tempo 2.8 | Grafana

Jun 13, 2025 By Grafana In Grafana

In this video, Joe Elliott does a deep dive demo into the experimental TraceQL features in Grafana Tempo 2.8.

View Video

Grafana

Read more about New TraceQL features in Grafana Tempo 2.8 | Grafana

How to query Grafana Loki with LogQL

Jun 13, 2025 By Grafana In Grafana

In this video, Senior Developer Advocate Nicole van der Hoeven talks about how to query Grafana Loki with LogQL. She talks about different ways to query Loki without using LogQL, how to get started with LogQL, and the three types of expressions that can be used in LogQL.

View Video

Grafana

Read more about How to query Grafana Loki with LogQL

Everything You Need to Know About Event Logs

Jun 13, 2025 By Faiz Shaikh In Last9

Your code passes locally, CI is green, and the deploy goes through. Then production throws a 500, and the trace isn’t helpful. This is where event logs help. A log captures timestamped records of what the app did: HTTP requests, DB queries, cache misses, retries, failures. These entries give you enough context to debug without reproducing the issue locally. Especially when dealing with distributed systems, logs are often the only consistent source of truth.

Read Post

Last9

Read more about Everything You Need to Know About Event Logs

Invisible dependencies, visible impact: Lessons from the Google Cloud outage

Jun 13, 2025 By Catchpoint Team In Catchpoint

June 12, 2025. A date most of the Internet won’t remember — but anyone relying on Google Cloud will. In the span of minutes, a routine quota update snowballed into global disruption. APIs stopped responding. Dashboards stayed green. And across continents, teams scrambled to figure out if the problem was theirs — or Google's. It wasn’t a cyberattack. It wasn’t a datacenter fire.

Read Post

Catchpoint

Read more about Invisible dependencies, visible impact: Lessons from the Google Cloud outage

Observability trends in Japan: Insights from Grafana Labs' latest survey

Jun 13, 2025 By Trevor Jones In Grafana

Japanese organizations are focused on controlling costs and limiting complexity—and they might be getting ready to broaden their adoption at just the right time, according to analysis of a micro survey on observability recently conducted by Grafana Labs. Observability is an evolving space in Japan, and this is the first time Grafana Labs has run a Japanese version of our annual Observability Survey.

Read Post

Grafana

Read more about Observability trends in Japan: Insights from Grafana Labs' latest survey

The One With A Cribl Copilot Overview

Jun 13, 2025 By Cribl In Cribl

When you run critical IT or Security systems, you're on the hook if an alert is missed or a pipeline drops data. That is why Cribl Copilot never tries to replace you. Instead, it works as a partner you can question, guide, and overrule. Join us for this overview of Cribl Copilot!

View Video

Cribl

Read more about The One With A Cribl Copilot Overview

New: Status modal integration is here!

Jun 13, 2025 By Colin Bartlett In StatusGator

At StatusGator, our mission is to make status transparency effortless—for you and your users. Today, we’re introducing a new way to keep your users informed in real time: the Status Modal Embed. Now you can display a compact, customizable modal on your website that shows the current status of your services—incidents, maintenance, or full operational status—all with a direct link to your full status page.

Read Post

StatusGator

Read more about New: Status modal integration is here!

A guide to PHP exception handling

Jun 13, 2025 By Mauro Chojrin In Honeybadger

In most object-oriented languages, exceptions are an extremely powerful mechanism for dealing with unexpected situations that arise when running your code. PHP has supported robust exception handling since PHP 7.0. As you begin your programming journey, exceptions are a source of tremendous pain. Over time, you grow to appreciate the value they bring.

Read Post

Honeybadger

Read more about A guide to PHP exception handling

How to reduce Cloud Costs (with Open Source!)

Jun 13, 2025 By Coroot In Coroot

We strongly believe that simple observability should be an innovation everyone can afford to benefit from, which is why Coroot is open source and includes cost monitoring for Azure, GCP, AWS, or your own custom settings. eBPF automatically tracks how each deployment impacts your cloud costs, so you can easily roll back changes when necessary and avoid a Lovecraftian monthly bill.

View Video

Coroot

Read more about How to reduce Cloud Costs (with Open Source!)

How to Set Up a Syslog Server: A Complete Step-By-Step Guide

Jun 13, 2025 By Staff Member In SolarWinds

Syslog servers are essential for centralized log management, helping network engineers monitor, troubleshoot, and secure network devices efficiently. This guide walks you through setting up a syslog server from scratch, focusing on practical steps using rsyslog on a Linux system—a common and robust choice for syslog collection. Windows does not have a native syslog server, so you need third-party software.

Read Post

SolarWinds

Read more about How to Set Up a Syslog Server: A Complete Step-By-Step Guide

Monitoring your Nextjs application using OpenTelemetry

Jun 12, 2025 By Sai Deepesh In SigNoz

Nextjs is a production-ready React framework for building single-page web applications. It enables you to build fast and user-friendly static websites, as well as web applications using Reactjs. Using OpenTelemetry Nextjs libraries, you can set up end-to-end tracing for your Nextjs applications. Nextjs has its own monitoring feature, but it is limited to measuring metrics like Core Web Vitals and real-time analytics of the application.

Read Post

SigNoz

Read more about Monitoring your Nextjs application using OpenTelemetry

Using the Sentry Unity SDK for Error and Crash Reporting

Jun 12, 2025 By Erin T In Sentry

It’s one thing to debug your game during development, but once your game is in production, visibility into errors and failures in real gameplay is harder to achieve. And when something does go wrong, players are more likely to ragequit than submit a bug ticket.

Read Post

Sentry

Read more about Using the Sentry Unity SDK for Error and Crash Reporting

Grafana Tempo 2.8 release: memory improvements, new TraceQL features, and more

Jun 12, 2025 By Tiffany Jernigan In Grafana

Grafana Tempo 2.8 is officially here, delivering new TraceQL features, performance improvements, and bug fixes, as well as some breaking changes. Watch the video below to learn more about the TraceQL features, or continue reading to get a quick overview of these and other updates. If you’re looking for something more in-depth for all of the changes that happened in this release, head over to the Grafana Tempo 2.8 release notes or the changelog.

Read Post

Grafana

Read more about Grafana Tempo 2.8 release: memory improvements, new TraceQL features, and more

What's Slowing You Down? How Intelligent Operations Accelerate Business Transformation

Jun 12, 2025 By ScienceLogic In ScienceLogic

Your organization has a bold modernization roadmap. Cloud migration. Application updates. Enhanced customer experiences. New revenue streams. The business case is compelling, the stakeholders are aligned, and the budget is approved. Yet six months in, progress feels sluggish. The cloud migration is behind schedule due to performance issues no one anticipated. Application modernization stalled when the team discovered integration complexities that weren’t apparent during planning.

Read Post

ScienceLogic

Read more about What's Slowing You Down? How Intelligent Operations Accelerate Business Transformation

Getting started with HaloPSA dashboards

Jun 12, 2025 By Dan Watts In Squared Up

The HaloPSA plugin is a new addition to SquaredUp, and helps you create live dashboards that surface the important metrics – giving you and your team a single pane of glass for help desk performance, asset visibility, and client reporting. Why it matters: If your team uses HaloPSA to manage tickets, assets, and clients, then you already know how vital that data is for running smooth operations.

Read Post

Squared Up

Read more about Getting started with HaloPSA dashboards

Could your Palo Alto firewall do more to protect you against Shadow AI?

Jun 12, 2025 By Teneo In Teneo

In recent months, my conversations with fellow technology leaders have consistently revolved around two key themes: how we leverage AI to drive innovation and efficiency, and how we mitigate the inherent risks associated with AI. However, I’ve noticed a concerning gap – while enterprises are busy strategizing the adoption of AI to enhance productivity, reduce costs, and outpace competitors, very few are addressing how AI is being actively used today by their own teams.

Read Post

Teneo

Read more about Could your Palo Alto firewall do more to protect you against Shadow AI?

Fluent Bit Helm Chart: Simplify Log Collection in Kubernetes

Jun 12, 2025 By Anjali Udasi In Last9

Collecting logs in Kubernetes often starts as a simple goal, and quickly turns into a game of “where did that log line go?” Between sidecars, DaemonSets, and countless config options, it’s easy to get lost. Fluent Bit helps cut through the noise. It's fast, lightweight, and plays well with Kubernetes. And when you deploy it using Helm charts? The setup becomes way more manageable. This guide covers the how and the why, without overcomplicating the what.

Read Post

Last9

Read more about Fluent Bit Helm Chart: Simplify Log Collection in Kubernetes

Test and Monitor React/Next.js Apps with Playwright

Jun 12, 2025 By Checkly In Checkly

Watch the full recording of our in-depth webinar on how to test and monitor React and Next.js applications using Playwright. This session is packed with practical insights, real-world examples, and actionable advice to help developers automate testing and boost front-end performance monitoring.

View Video

Checkly

Read more about Test and Monitor React/Next.js Apps with Playwright

Boosting your AWS monitoring ROI: Strategies that deliver

Jun 12, 2025 By Sinjan Ballav In Site24x7

AWS gives you the power to scale, deploy, and innovate at speed. However, with that speed comes a good amount of complexity. Services multiply, resources balloon, and performance issues sneak in when you least expect them. That’s where monitoring comes in. But it isn’t about checking boxes on dashboards. It’s about getting the most value for every dollar you spend, or maximizing your return on investment (ROI) from AWS monitoring. So, how do you actually do that?

Read Post

Site24x7

Read more about Boosting your AWS monitoring ROI: Strategies that deliver

Beyond Storage: How Time Series Databases Are Becoming Intelligent Data Engines

Jun 12, 2025 By Allyson Boate In InfluxData

Data isn’t just a record of what happened—it shapes what happens next. Across industries, connected devices continuously stream time-stamped data that reflects the current state of machines, environments, and systems. This steady flow gives businesses a live view of their operations and the opportunity to catch issues early, adjust quickly, and operate more efficiently.

Read Post

InfluxData

Read more about Beyond Storage: How Time Series Databases Are Becoming Intelligent Data Engines

Lumigo Copilot AI Launches to Automate Root Cause Analysis and Remediation

Jun 12, 2025 By Erez Berkner In Lumigo

Today, we’re announcing the general availability of Lumigo Copilot, the most intelligent AI-powered observability assistant on the market, built for the complexities of modern microservices. Copilot emerged from a simple realization: Distributed systems produce too much fragmented data across too many layers, making troubleshooting slow, reactive, and deeply manual. Copilot changes that.

Read Post

Lumigo

Read more about Lumigo Copilot AI Launches to Automate Root Cause Analysis and Remediation

Ops Explained: AIOps vs. DevOps vs. MLOps vs. Agentic AIOps

Jun 12, 2025 By LogicMonitor In LogicMonitor

There’s a common misconception in IT operations that mastering DevOps, AIOps, or MLOps means you’re “fully modern.” But these aren’t checkpoints on a single journey to automation. DevOps, MLOps, and AIOps solve different problems for different teams—and they operate on different layers of the technology stack. They’re not stages of maturity. They’re parallel areas that sometimes interact, but serve separate needs.

Read Post

LogicMonitor

Read more about Ops Explained: AIOps vs. DevOps vs. MLOps vs. Agentic AIOps

The 1st Successful Commercial Moon Landing | Firefly's Blue Ghost Mission 1 | Grafana Everywhere

Jun 12, 2025 By Grafana In Grafana

Firefly’s Blue Ghost Mission One successfully landed on the moon with the help of Grafana. In this behind-the-scenes talk, learn how real-time dashboards powered critical decisions during descent, tracked payloads, and helped operators visualize everything from footpad sensors to lunar gravity. Footage and photos courtesy of Firefly Aerospace.

View Video

Grafana

Read more about The 1st Successful Commercial Moon Landing | Firefly's Blue Ghost Mission 1 | Grafana Everywhere

Elastic - The Search AI Company

Jun 12, 2025 By Elastic In Elastic

You may not know it, but you probably use Elastic every day. By combining the transformative power of AI with our deep expertise in search and vector databases, we are changing what's possible with search. Our Search AI Platform empowers organizations to have a conversation with all their data, build powerful GenAI applications, immediately diagnose root causes in observability, and hunt for threats at enterprise scale.

View Video

Elastic

Read more about Elastic - The Search AI Company

Top Five Reasons Telemetry Pipelines Should Be on Every Engineer's Radar

Jun 12, 2025 By Mezmo In Mezmo

You’ve probably felt the pain: data pouring in from every corner of your stack, tools choking on volume, dashboards lagging behind reality, alerts firing (or worse, not firing) without context. If that sounds familiar, it’s time to get serious about telemetry pipelines. Whether you're an SRE trying to stabilize a flapping service or a developer navigating multi-cloud chaos, a telemetry pipeline helps you take control of the data firehose.

Read Post

Mezmo

Read more about Top Five Reasons Telemetry Pipelines Should Be on Every Engineer's Radar

Datadog + OpenAI: Codex CLI integration for AI-assisted DevOps

Jun 12, 2025 By Reilly Wood In Datadog

We are exploring how we can help on-call engineers troubleshoot incidents more effectively by providing the OpenAI Codex agent with access to real-time observability data in terminals. We've developed an integration and new tool visualizations that connect OpenAI's Codex CLI to the new Datadog MCP server. In this post, we'll share what we've been experimenting with: enabling an AI agent to retrieve production metrics, logs, and incidents from Datadog in real time and act on that context.

Read Post

Datadog

Read more about Datadog + OpenAI: Codex CLI integration for AI-assisted DevOps

Beyond PHP-FPM: Modern PHP Application Servers

Jun 12, 2025 By Johannes Rauh In Icinga

For decades, PHP has powered the web using a simple model: process a request, send a response, then shut down. This model, especially in the form of CGI and PHP-FPM, is easy to understand but increasingly inefficient for modern web demands.

Read Post

Icinga

Read more about Beyond PHP-FPM: Modern PHP Application Servers

Atlassian Confluence Monitoring on Microsoft SCOM

Jun 11, 2025 By NiCE IT Mgmt In NiCE IT Mgmt

As part of a customer project, we developed a custom Confluence Management Pack for Microsoft System Center Operations Manager (SCOM). This tailored solution enables IT operations teams to monitor key performance and health metrics of Confluence environments, ensuring knowledge-sharing platforms remain available and performant.

Read Post

NiCE IT Mgmt

Read more about Atlassian Confluence Monitoring on Microsoft SCOM

Sponsored Post

The Network-First Advantage: How Fabrix.ai Redefines Observability from the Ground Up

Jun 11, 2025 By Andrew Mallaband In Fabrix

Modern enterprises today often find themselves in a peculiar predicament: they are drowning in a deluge of telemetry data—including logs, metrics, and traces—yet paradoxically remain blind to what truly matters. Despite making substantial investments in observability tools, teams frequently find themselves reacting to incidents rather than proactively preventing them, with alerts flooding dashboards often devoid of critical context.

Read Post

Fabrix

Read more about The Network-First Advantage: How Fabrix.ai Redefines Observability from the Ground Up

Guided by Trust: ScienceLogic Earns TrustRadius Top Rated for the Sixth Year Running

Jun 11, 2025 By ScienceLogic In ScienceLogic

In a world where IT complexity is accelerating, trust has never been more essential. At ScienceLogic, trust isn’t just a value—it’s our compass. It guides how we innovate, how we serve, and how we grow alongside our customers. That’s why we’re proud to share that ScienceLogic SL1 has once again been named a Top Rated product on TrustRadius—for the sixth consecutive year. This recognition is more than a milestone.

Read Post

ScienceLogic

Read more about Guided by Trust: ScienceLogic Earns TrustRadius Top Rated for the Sixth Year Running

5 Ways to Optimize Your OpenSearch Cluster

Jun 11, 2025 By Jade Lassery In logz.io

OpenSearch is a powerful, scalable search and analytics engine that can do amazing things for logging, observability, and full-text search. But like any distributed system, it only performs well if you keep it properly tuned and healthy. Ignore it, and you risk slower queries, higher costs, and even data loss. Here are five practical tips to keep your OpenSearch cluster running smoothly and efficiently.

Read Post

logz.io

Read more about 5 Ways to Optimize Your OpenSearch Cluster

Is it the network... or the CDN?

Jun 11, 2025 By Catchpoint In Catchpoint

When performance issues strike, the finger pointing begins. But here's the catch: CDNs aren't just "someone else's responsibility." They directly impact the user experience, and if they're misbehaving, your network team will be the first to get the call. That’s why CDN monitoring is essential. CDNs are dynamic and performance can vary dramatically across regions, ISPs, or even end users. When something goes wrong, it looks like a network issue, unless you have visibility into CDN behavior.

View Video

Catchpoint

Read more about Is it the network... or the CDN?

How to Configure Lightweight Browser Tracing for Debugging at Scale

Jun 11, 2025 By Ben Coe In Sentry

Sentry’s auto-instrumentation, using BrowserTracing, is convenient. You can get interesting insights about your frontend application out-of-the-box, such as whether slow and failing API calls are hurting your user experience (summarized in Network Requests), or how your website stacks up against industry standards for performance (summarized in Web Vitals).

Read Post

Sentry

Read more about How to Configure Lightweight Browser Tracing for Debugging at Scale

Interactive Service Map

Jun 11, 2025 By Honeycomb In Honeycomb

Honeycomb Service Map is dynamically generated from traces. It shows all the connections between services, and you can narrow it down by any query, for instance to see the services one particular user hit. From there, you can get to a trace with the full details. Service Map is a feature of Honeycomb Enterprise.

View Video

Honeycomb

Read more about Interactive Service Map

No Sandwich, No Security: What This Week's Lunch Taught Me About DNS Blind Spots

Jun 11, 2025 By Teneo In Teneo

Like many shoppers in the UK this week, I found myself staring at half-empty shelves in my local grocery store. In a small but frustrating twist, my usual sandwich, chicken mayo on malted bread, was nowhere to be found. The disruption wasn’t just about lunchtime preferences; it was part of a broader impact from cyberattacks that hit major UK retailers, including Co-op and Marks & Spencer.

Read Post

Teneo

Read more about No Sandwich, No Security: What This Week's Lunch Taught Me About DNS Blind Spots

An Easy Guide to Getting Started with Elastic APM

Jun 11, 2025 By Faiz Shaikh In Last9

Code in production will break. Maybe a request takes too long, maybe it fails quietly, or maybe it works fine one minute and falls over the next. Logs can help, sure—but they don’t always show the full picture, especially when performance issues are involved. Elastic APM gives you a clearer view. It traces what your application is doing from incoming requests to database queries and everything in between.

Read Post

Last9

Read more about An Easy Guide to Getting Started with Elastic APM

How to Create Business Dashboards with Snowflake and Grafana Cloud

Jun 11, 2025 By Grafana In Grafana

Ready to bring your Snowflake data into Grafana dashboards? In this Grafana Quickstart, Shawn Pitts walks through how to connect Snowflake to Grafana Cloud using the official plugin — available on all tiers, including Cloud Free.

View Video

Grafana

Read more about How to Create Business Dashboards with Snowflake and Grafana Cloud

The Architecture Loop: How Early Can We Decide Speed, Stack and Scale?

Jun 11, 2025 By Sarah Morgan In Scout

In 2025, many companies are reckoning with the true cost of microservices, especially as cloud bills grow and engineering teams face coordination fatigue. The move back to monoliths is gaining traction, particularly for startups and mid-sized businesses who need: At Scout APM, we’ve been thinking about these shifts not just from a monitoring perspective, but from a broader architectural one.

Read Post

Scout

Read more about The Architecture Loop: How Early Can We Decide Speed, Stack and Scale?

The best of both worlds with the Splunk Cloud Platform

Jun 11, 2025 By Splunk In Splunk

This video describes the value of migrating to the Splunk Cloud Platform, which provides a comprehensive environment that offers everything from efficiency and sustainability to agility and security, plus lower costs. How can you be sure? With the Splunk Cloud Calculator we’ll show you the real dollar savings you could get from migrating to the Splunk Cloud Platform.

View Video

Splunk

Read more about The best of both worlds with the Splunk Cloud Platform

We finally got rid of Bootstrap css! #coding #programming #webdesign #debugging #css

Jun 11, 2025 By TrackJS In TrackJS

Jordan from @Trackjs celebrates finally getting rid of bootstrap styles in the TrackJS redesign.

View Video

TrackJS

Read more about We finally got rid of Bootstrap css! #coding #programming #webdesign #debugging #css

The value of cloud is clear with Splunk

Jun 11, 2025 By Splunk In Splunk

Understand the value of moving to the Splunk Cloud Platform, which can help you deliver more use cases faster and greatly reduce admin time so you can focus on gaining value and innovating faster with Splunk.

View Video

Splunk

Read more about The value of cloud is clear with Splunk

7 critical Active Directory metrics every IT admin should monitor

Jun 11, 2025 By Priya Praburam In ManageEngine

Across vast enterprise networks, Active Directory (AD) serves as the foundational layer for identity and access management. It's the critical service enabling user authentication, managing authorizations, and ensuring smooth operations across your network. Given its central role, any hiccup in AD can lead to widespread outages, security vulnerabilities, or frustrating user experiences.

Read Post

ManageEngine

Read more about 7 critical Active Directory metrics every IT admin should monitor

How To Visualize BigQuery Data in Grafana

Jun 11, 2025 By Grafana In Grafana

Ready to bring your BigQuery data into focus? In this Grafana Quickstart, Shawn Pitts walks through how to connect Google BigQuery to Grafana Cloud using the official plugin — available on all tiers, including Cloud Free.

View Video

Grafana

Read more about How To Visualize BigQuery Data in Grafana

Grafana Learning Journeys: How to Send Logs to Grafana Cloud Using Alloy

Jun 11, 2025 By Grafana In Grafana

In this Grafana Learning Journey supplementary video, Developer Advocate Marie Cruz shows how to send system logs to Loki in Grafana Cloud using Alloy.

View Video

Grafana

Read more about Grafana Learning Journeys: How to Send Logs to Grafana Cloud Using Alloy

Is It a Cup or a Pot? Helping You Pinpoint the Problem-and Sleep Through the Night

Jun 11, 2025 By Mezmo In Mezmo

It’s 3 AM. Your phone screams. You stumble to your laptop, eyes half-closed, wondering the same question every SRE has asked mid-incident.

Read Post

Mezmo

Read more about Is It a Cup or a Pot? Helping You Pinpoint the Problem-and Sleep Through the Night

Achieving Comprehensive Network Observability for VMware Cloud Foundation

Jun 11, 2025 By Nestor Falcon Gonzalez In Broadcom

Private cloud infrastructure adoption is accelerating rapidly. This move is driven by the ongoing “cloud reset” as leaders rethink their hybrid and multi-cloud strategies, seeking greater control, security, and flexibility for their IT workloads. As a matter of fact, leaders in 69% of organizations are considering repatriating workloads, and one-third already have.

Read Post

Broadcom

Read more about Achieving Comprehensive Network Observability for VMware Cloud Foundation

Data points per minute in Grafana Cloud: What you need to know about DPM

Jun 11, 2025 By Matt Wimpelberg In Grafana

If you’re working with metrics in Grafana Cloud, chances are you’ve come across DPM (data points per minute). It shows up in usage dashboards, invoice breakdowns, and occasionally pops up in Slack when your ingestion numbers start looking suspicious. DPM can also be seen in the Grafana Cloud billing and usage dashboard, which is available by default in every Grafana Cloud account. It helps you understand how much data you’re sending—and whether it’s more than you need.

Read Post

Grafana

Read more about Data points per minute in Grafana Cloud: What you need to know about DPM

Cisco and Splunk Strengthen Enterprise Digital Resilience in the AI Era

Jun 10, 2025 By Kamal Hathi In Splunk

In an era where hybrid environments and AI-driven innovations redefine enterprise operations, organizations face increasing complexity, disruption, and vulnerability in their systems. To overcome this growing challenge, Cisco and Splunk are working together to harness the power of AI to help customers ensure that digital resilience is an inherent part of their systems.

Read Post

Splunk

Read more about Cisco and Splunk Strengthen Enterprise Digital Resilience in the AI Era

Yes, Sentry has an MCP Server (...and it's pretty good)

Jun 10, 2025 By Cody De Arkland In Sentry

Unless you’ve been living under a rock, “MCP” is probably a term you’ve heard thrown around in the AI space. Editors and LLM providers have been racing to add and enhance their MCP support. Sentry was fortunate enough to be included in Anthropic's release announcements for MCP.

Read Post

Sentry

Read more about Yes, Sentry has an MCP Server (...and it's pretty good)

Implementing Grafana Play privacy policies with Grafana k6: A behind-the-scenes look

Jun 10, 2025 By Marie Cruz In Grafana

Grafana Play is a free and publicly accessible sandbox environment that allows users to explore and learn Grafana without setting up their own instance. Grafana Play comes preloaded with ready-made sample dashboards, and showcases how to work with different data sources, create visualizations, and use advanced Grafana features.

Read Post

Grafana

Read more about Implementing Grafana Play privacy policies with Grafana k6: A behind-the-scenes look

7 proven ways to speed up your website

Jun 10, 2025 By ManageEngine Site24x7 In Site24x7

A slow-loading website is costly. Just a few extra seconds can drive users away and increase bounce rates. In this video, we’ll walk you through seven proven techniques to improve your website’s load speed.

View Video

Site24x7

Monitoring

Read more about 7 proven ways to speed up your website

Getting OpenTelemetry Data Into Graylog

Jun 10, 2025 By Jeff Darrington In Graylog

OpenTelemetry is emerging as the common framework for collecting observability data, and for good reason. It’s vendor-neutral, open source, and designed to collect traces, metrics, and logs in a consistent way. But while most of the buzz is around tracing and metrics, let’s not forget: logs are still the backbone of investigation and response. That’s why Graylog now supports native collection of OpenTelemetry data over gRPC.

Read Post

Graylog

Read more about Getting OpenTelemetry Data Into Graylog

The truth you can't afford to miss: Listen as your logs spill the tea

Jun 10, 2025 By Merylee Heggem In Sumo Logic

When you hear “spill the tea,” you probably think of pop culture, not outages or anomalies. But the origin may surprise you: before it was slang for juicy gossip, ‘tea’ was actually ‘T,’ which represents truth. We know what you’re thinking: “Are you trying to say ‘spilling the tea’ is a good thing?” And yes, that’s exactly what we’re saying, especially when your logs are doing the talking.

Read Post

Sumo Logic

Read more about The truth you can't afford to miss: Listen as your logs spill the tea

Why companies keep migrating to Coralogix

Jun 10, 2025 By Ofri Grushka & Mayur Moon In Coralogix

As businesses scale, so do their observability needs, but many find themselves stuck with costly, inflexible platforms that no longer serve them. Despite mounting frustrations, the complexity of migration keeps companies from making a change. The risk of losing critical data, disrupting workflows, or rebuilding everything from scratch often outweighs the benefits of switching. Most vendors offer little to no migration support, forcing teams to manually reconfigure dashboards, alerts, and integrations.

Read Post

Coralogix

Read more about Why companies keep migrating to Coralogix

Accelerate Oracle Cloud Infrastructure monitoring with Datadog OCI QuickStart

Jun 10, 2025 By Natalie Wilkinson In Datadog

Datadog’s Oracle Cloud Infrastructure integration enables you to collect metrics and logs from your entire OCI stack and monitor them within a single platform alongside other third-party technologies. Datadog’s new OCI QuickStart is a fully managed, single-flow setup experience that helps you monitor your OCI infrastructure and applications in just a few clicks.

Read Post

Datadog

Read more about Accelerate Oracle Cloud Infrastructure monitoring with Datadog OCI QuickStart

Create and monitor LLM experiments with Datadog

Jun 10, 2025 By Tom Sobolik In Datadog

To efficiently optimize your LLM application before pushing to production, you need a comprehensive testing and evaluation framework. By running experiments, you can optimize prompts, fine-tune temperature and other key parameters, test complex agent architectures, and understand how your application may respond to atypical, complex, or adversarial inputs. However, it can be difficult to manage your experiment runs and aggregate the results for meaningful analysis.

Read Post

Datadog

Read more about Create and monitor LLM experiments with Datadog

Integrations made easy with VictoriaMetrics Cloud

Jun 10, 2025 By Jose Gomez-Selles In VictoriaMetrics

VictoriaMetrics Cloud continues to evolve as the most efficient, scalable and open platform in the observability landscape. In our last Q1 update blogpost, we shared new features such as seamless OpenTelemetry integrations, new Organizations support, and improvements in the Explore UI and APIs. This time we wanted to take a minute to showcase how we’re taking the interoperability journey very seriously. Integrations in VictoriaMetrics Cloud Haven’t tried VictoriaMetrics Cloud yet?

Read Post

VictoriaMetrics

Read more about Integrations made easy with VictoriaMetrics Cloud

DASH by Datadog 2025 Keynote

Jun 10, 2025 By Datadog In Datadog

Join us at the 2025 DASH Keynote and be the first to experience Datadog's latest product innovations. This year, we're unveiling next-generation observability features, innovative ways to secure your AI workloads, and powerful agentic AI capabilities throughout the Datadog platform. Discover the new ways your teams can observe, secure, and act in the age of AI.

View Video

Datadog

Read more about DASH by Datadog 2025 Keynote

CI/CD Observability with OpenTelemetry - A Step by Step Guide

Jun 10, 2025 By Elizabeth Mathew In SigNoz

In the fast-paced world of CI/CD, understanding the performance and behaviour of your pipelines is crucial. GitHub Actions has become a popular choice for automating builds and deployments, but anyone who's debugged a flaky workflow or long-running job knows how challenging it can be to get visibility into what's happening under the hood. We usually rely on build logs, timing data, or guesswork when something goes wrong.

Read Post

SigNoz

Read more about CI/CD Observability with OpenTelemetry - A Step by Step Guide

The Mindset Shift: IT Operations to Security - SolarWinds TechPod 099

Jun 10, 2025 By solarwindsinc In SolarWinds

In this episode, hosts Sean Sebring and Chrystal Taylor engage with actual rock star Chris Greer, a Security Engineering Manager at SolarWinds, to explore the multifaceted world of cybersecurity. Chris shares his unconventional journey from being a musician to entering the IT field, emphasizing the importance of certifications and the mindset shift required when transitioning from IT operations to security.

View Video

SolarWinds

Read more about The Mindset Shift: IT Operations to Security - SolarWinds TechPod 099

Built for Impact: What Happens When LogicMonitor Edwin AI Meets Infosys AIOps Insights

Jun 10, 2025 By LogicMonitor In LogicMonitor

Today’s IT environments span legacy infrastructure, multiple cloud platforms, and edge systems—each producing fragmented data, inconsistent signals, and hidden points of failure. This scale brings opportunity, but also operational strain: fragmented visibility, overwhelming alert noise, and slower time to resolution. With good reason, public and private sector organizations alike are moving beyond basic visibility, demanding hybrid observability that’s context-aware and action-oriented.

Read Post

LogicMonitor

Read more about Built for Impact: What Happens When LogicMonitor Edwin AI Meets Infosys AIOps Insights

Introducing Bits AI SRE, your AI on-call teammate

Jun 10, 2025 By Kai Xin Tai In Datadog

Getting paged pulls engineers away from meaningful work, yet incident response in many organizations remains manual, reactive, and draining. An alert fires and teams scramble to find the root cause, relying on siloed knowledge, incomplete context, and a few on-call experts who are already stretched thin. The rise of AI coding agents has only intensified this challenge: As teams ship code faster with less human oversight, production systems grow increasingly complex and harder to understand.

Read Post

Datadog

Read more about Introducing Bits AI SRE, your AI on-call teammate

How To: SLA Monitoring & Reporting: Are You Getting What You Paid For?

Jun 10, 2025 By Alyssa Lamberti In Obkio

Are you tired of feeling like you're in the dark about the services you're paying for? Are you getting what you paid for? Many businesses are in the same boat when it comes to Service Level Agreement (SLA) monitoring and reporting. It’s great having an SLA for the provision of a service, but you need to go further to really understand if the standards specified in the SLA are actually being met. That’s where SLA monitoring and reporting comes in.

Read Post

Obkio

Read more about How To: SLA Monitoring & Reporting: Are You Getting What You Paid For?

Moving from Relational to Time Series Databases

Jun 10, 2025 By Heather Downing In InfluxData

I’ve been building apps with SQL Server for years. Everything worked well until I started dealing with sensor data, stock trade volume, and IoT telemetry. As the volume of time-stamped records grew into the millions, I saw relational databases struggling with workloads they weren’t designed for. That’s when I explored time series databases. The performance improvements were significant, but what surprised me was the mental shift required.

Read Post

InfluxData

Read more about Moving from Relational to Time Series Databases

Datadog MCP Server: Connect your AI agents to Datadog tools and context

Jun 10, 2025 By Bowen Chen In Datadog

As development teams adopt AI-powered tools and build services that make use of AI agents, they want to extend their AI capabilities to incorporate familiar tools and observability data. However, AI agents struggle with regular API endpoints and frequently fail when parsing complex nested JSON hierarchies or incorrectly handling errors. As a result, these agents often fail to retrieve relevant results.

Read Post

Datadog

Read more about Datadog MCP Server: Connect your AI agents to Datadog tools and context

Optimize and troubleshoot AI infrastructure with Datadog GPU Monitoring

Jun 10, 2025 By Anjali Thatte In Datadog

As organizations bring more AI and LLM workloads into production, the underlying GPU infrastructure that supports these workloads becomes even more critical in ensuring these workloads remain fast, reliable, and scalable. Inefficient GPU resource usage, for instance, can lead to longer runtimes and reduced throughput, negatively impacting overall model performance. Additionally, idle and underutilized GPUs can quickly drive up costs and lead to needless spending.

Read Post

Datadog

Read more about Optimize and troubleshoot AI infrastructure with Datadog GPU Monitoring

How to Monitor Kafka Producer Metrics

Jun 10, 2025 By Anjali Udasi In Last9

Your Kafka producer pushed a million messages yesterday. Nice. But can you tell if they all made it? Or why latency spiked at 2 PM? Producer metrics help you determine that. They expose how long messages take to send, whether messages are getting stuck, and whether retries are piling up. Let’s go over which ones help while debugging and how to monitor them.

Read Post

Last9

Read more about How to Monitor Kafka Producer Metrics

What Is an IP Calculator and How to Use It for Efficient Network Management

Jun 10, 2025 By Olivia Díaz In Pandora FMS

Discover what an IP calculator is and how it helps you plan subnets, IP ranges, and addresses within IT networks. Ideal for system administrators.

Read Post

Pandora FMS

Read more about What Is an IP Calculator and How to Use It for Efficient Network Management

Automatically identify issues and generate fixes with Bits AI Dev

Jun 10, 2025 By Mike Leach In Datadog

Developers lose hours each week to a familiar troubleshooting loop: chase down telemetry across dashboards, decipher vague errors, and juggle alerts to find the signal worth fixing. Production issues, performance regressions, and security vulnerabilities all demand attention, but they often come with little context for taking action.

Read Post

Datadog

Read more about Automatically identify issues and generate fixes with Bits AI Dev

Improve performance and reliability with Proactive App Recommendations

Jun 10, 2025 By Yoann Robin In Datadog

As your organization grows, you may operate in increasingly complex environments and manage more services and larger teams to maintain them. Evolution like this can lead to an explosion of telemetry data from across your stack, including metrics, traces, logs, and frontend interactions. The benefit of greater visibility is often outweighed by the challenge of acting on the data you collect, and you can easily fall behind on implementing the fixes your services require to operate reliably and efficiently.

Read Post

Datadog

Read more about Improve performance and reliability with Proactive App Recommendations

Ensure trust across the entire data life cycle with Datadog Data Observability

Jun 10, 2025 By Nicholas Thomson In Datadog

As data systems grow more complex and data becomes even more business-critical, teams struggle to detect and resolve issues that impact data quality, reliability, and, ultimately, trust. Engineers have to rely on manual checks and ad hoc SQL queries to catch data quality issues—often after teams relying on the data have noticed something has gone wrong.

Read Post

Datadog

Read more about Ensure trust across the entire data life cycle with Datadog Data Observability

Introducing Seer: Sentry's AI Debugging Agent

Jun 10, 2025 By Sentry In Sentry

There's a lot more context to an error than the message blinking in red on your screen. Seer understands the context of your application and everything behind that error. Seer collects information from the Stack Trace, Logs, Traces and Spans, Profiles, and the code from your GitHub repo and uses it to understand what's causing your issues, and propose fixes.

View Video

Sentry

Read more about Introducing Seer: Sentry's AI Debugging Agent

How AI Roll-Ups & SASE Can Double Profits Overnight

Jun 9, 2025 By Teneo In Teneo

Last night, over dinner in Canary Wharf, I found myself catching up with a former colleague who is now leading investments at one of the big banks.

Read Post

Teneo

Read more about How AI Roll-Ups & SASE Can Double Profits Overnight

How IPM helped a top tech brand catch an OpenAI outage before it became a crisis

Jun 9, 2025 By Brian Costain In Catchpoint

Today’s digital businesses are more interconnected than ever. Industry research shows that 74% of organizations now take an “API-first” approach, and the average application is powered by between 26 and 50 APIs. While this accelerates innovation, it also introduces new risks: when an external provider fails, the impact can be immediate and far-reaching.

Read Post

Catchpoint

Read more about How IPM helped a top tech brand catch an OpenAI outage before it became a crisis

Top 15 Distributed Tracing Tools for Microservices in 2025

Jun 9, 2025 By Ankit Anand In SigNoz

In one of our previous blogs, we discussed distributed tracing in depth. We examined why distributed tracing is critical and its components - spans and trace context. You can check the complete guide here: What is Distributed Tracing and How to Implement it with Open Source? Here, we'll look at some of the best distributed tracing tools. We'll see what each of them offers so that you can choose the right tool for your monitoring and observability requirements.

Read Post

SigNoz

Read more about Top 15 Distributed Tracing Tools for Microservices in 2025

Top 13 Open Source APM Tools [2025 Guide]

Jun 9, 2025 By Ankit Anand In SigNoz

Choosing the right APM tool is critical. How do you know which is the right one for you? Here are the top 13 open-source application performance monitoring (APM) tools that can solve your monitoring needs. Open-source APM tools have added benefits over their SaaS counterparts. They are more transparent, as you can verify their source code, and you can use them without going through the pains of obtaining approvals usually required for using a third-party vendor tool.

Read Post

SigNoz

Read more about Top 13 Open Source APM Tools [2025 Guide]

How to Collect .NET Application Logs with OpenTelemetry

Jun 9, 2025 By Abhishek Policepatil In SigNoz

Observability is essential for maintaining and scaling modern applications. With .NET 8, Microsoft has enhanced support for observability using OpenTelemetry. In this post, we explore how to monitor .NET 8 application logs with SigNoz, an open-source observability platform, using the OpenTelemetry Protocol (OTLP) exporter.

Read Post

SigNoz

Read more about How to Collect .NET Application Logs with OpenTelemetry

Auto-Instrument Everything with eBPF: Grafana Beyla + OpenTelemetry in Action | Homelabs

Jun 9, 2025 By Grafana In Grafana

Grafana Beyla is a powerful eBPF-based auto-instrumentation tool for application and network observability. In this session, see how Beyla captures RED metrics and traces with zero code changes, and how it fits into the OpenTelemetry ecosystem. Perfect session for SREs, devs, and home labbers alike.

View Video

Grafana

Read more about Auto-Instrument Everything with eBPF: Grafana Beyla + OpenTelemetry in Action | Homelabs

An Autonomous Ship is Set to Circumnavigate the World Using Docker, Grafana, & Starlink: Project Bob

Jun 9, 2025 By Grafana In Grafana

Join Andrew McCalip of Varda Space Industries as he builds Project Bob—a DIY, solar-powered, autonomous ship aiming to circumnavigate the globe using open source tools like Grafana, Raspberry Pi, and Starlink.

View Video

Grafana

Read more about An Autonomous Ship is Set to Circumnavigate the World Using Docker, Grafana, & Starlink: Project Bob

You Can Build Your Own AI Agent for ITOps-But Should You?

Jun 9, 2025 By LogicMonitor In LogicMonitor

Most internal AI projects for IT operations never exit pilot. Budgets stretch, priorities shift, key hires fall through, and what started as a strategic initiative turns into a maintenance burden or, worse, shelfware. Not because the teams lacked vision. But because building a production-grade AI agent is an open-ended commitment. It’s not just model tuning or pipeline orchestration. It’s everything: architecture, integrations, testing frameworks, feedback loops, governance, compliance.

Read Post

LogicMonitor

Read more about You Can Build Your Own AI Agent for ITOps-But Should You?

Smarter Telemetry Pipelines: The Key to Cutting Datadog Costs and Observability Chaos

Jun 9, 2025 By Mezmo In Mezmo

Log volume is exploding, costs are rising, and most teams are stuck duct-taping together short-term fixes. During our webinar, "Optimizing Log Management in Datadog: Cut Costs Without Losing Insights," we discuss how DevOps and engineering leaders are navigating the growing pains of observability, especially in environments where tools like Datadog are mission-critical but challenging to manage. Here’s a recap of the key takeaways.

Read Post

Mezmo

Read more about Smarter Telemetry Pipelines: The Key to Cutting Datadog Costs and Observability Chaos

Migrate historical logs from Splunk and Elasticsearch using Observability Pipelines

Jun 9, 2025 By Micah Kim In Datadog

Migrating to a new logging platform can be a complex operation, especially when it involves both active and historical logs. Observability Pipelines offers dual-shipping capability, making it easy to route active logs to your new platform without disrupting your log management workflows. But migrating years' worth of historical logs, which are critical for investigating security incidents and demonstrating compliance with applicable laws, requires a different approach.

Read Post

Datadog

Read more about Migrate historical logs from Splunk and Elasticsearch using Observability Pipelines

It's The End Of Observability As We Know It (And I Feel Fine)

Jun 9, 2025 By Austin Parker In Honeycomb

In a really broad sense, the history of observability tools over the past couple of decades has been about a pretty simple concept: how do we make terabytes of heterogeneous telemetry data comprehensible to human beings? New Relic did this for the Rails revolution, Datadog did it for the rise of AWS, and Honeycomb led the way for OpenTelemetry.

Read Post

Honeycomb

Read more about It's The End Of Observability As We Know It (And I Feel Fine)

Mastering NodeJS Performance Monitoring - A Practical Guide using Open Source Tools

Jun 9, 2025 By Ankit Anand In SigNoz

Node.js powers some of the fastest-growing web applications, but its single-threaded nature makes it vulnerable to memory leaks and CPU spikes. To keep your app running smoothly, especially in production, you need more than just web server logs — you need complete visibility across the entire stack.

Read Post

SigNoz

Read more about Mastering NodeJS Performance Monitoring - A Practical Guide using Open Source Tools

How to Integrate OpenTelemetry Collector with Prometheus

Jun 9, 2025 By Prathamesh Sonpatki In Last9

Pulling observability data together is rarely clean. Metrics come from everywhere, formats vary, and making sense of it takes some work. OpenTelemetry Collector and Prometheus fit perfectly here. The Collector handles ingestion and processing from different sources, while Prometheus stores and queries the data. Simple, effective, and no vendor lock-in. In this blog, we cover how to integrate the Collector with Prometheus, common pitfalls, and ways to control costs.

Read Post

Last9

Read more about How to Integrate OpenTelemetry Collector with Prometheus

A Complete Guide to Linux Log File Locations and Their Usage

Jun 9, 2025 By Anjali Udasi In Last9

Linux log files are text-based records that capture system events, application activities, and user actions. They're stored primarily in the /var/log directory and provide essential information for debugging issues, monitoring system health, and maintaining security. This guide covers the most important Linux log files and a few detailed techniques for reading and analyzing them.

Read Post

Last9

Read more about A Complete Guide to Linux Log File Locations and Their Usage

Site24x7: Synthetic monitoring vs. Real user monitoring

Jun 9, 2025 By ManageEngine Site24x7 In Site24x7

Want to know the difference between synthetic monitoring and real user monitoring (RUM)? You're not alone. In this video, we break down both monitoring types, show how they work, and explain when to use each, so you can build a monitoring strategy that gives you full visibility into your website or application performance. Whether you're a DevOps engineer, SRE, or IT admin, this video will help you make smarter monitoring decisions.

View Video

Site24x7

Read more about Site24x7: Synthetic monitoring vs. Real user monitoring

Lunar-level observability: How Firefly Aerospace used Grafana to monitor its historic moon landing

Jun 9, 2025 By Kristin Knapp In Grafana

On March 2, 2025, Firefly Aerospace made history. The company — a space services firm that offers safe, reliable, and economical access to space — completed the first fully successful lunar landing by a commercial provider with its Blue Ghost Mission 1. But behind the headlines and highlight reels was a team of dedicated engineers, years of preparation, and a mission control center outfitted with Grafana dashboards.

Read Post

Grafana

Read more about Lunar-level observability: How Firefly Aerospace used Grafana to monitor its historic moon landing

Successful Launch - Then Came The Problems

Jun 9, 2025 By Percepio In Percepio

About 15 years ago, I worked at a company building network security appliances (with ARM-based network processors) and was responsible for the development of custom Linux firmware. The product launch was successful; we shipped and managed a large fleet of devices in the field. After a few firmware releases, we received alerts from the device management system telling us that there were intermittent problems. We remoted into the appliances but could not reproduce the error.

Read Post

Percepio

Read more about Successful Launch - Then Came The Problems

Top 5 Open Source Log Management Tools (and How to Choose the Right One)

Jun 8, 2025 By Jade Lassery In logz.io

Managing logs at scale is no longer just about storing text—it’s about gaining insights fast, keeping systems healthy, and troubleshooting in real time. With cloud-native architectures becoming the norm, the pressure is on for modern teams to adopt log management tools that are fast, scalable, and easy to use. But with so many options, how do you choose the right one?

Read Post

logz.io

Read more about Top 5 Open Source Log Management Tools (and How to Choose the Right One)

The One Where We Show You Copilot Editor

Jun 7, 2025 By Cribl In Cribl

Copilot Editor is like an AI-powered Rosetta Stone for telemetry. It helps Cribl users take raw, messy telemetry data and turn it into standardized, analytics-ready formats. The most important piece? It puts YOU in control. Our human-in-the-loop design means that users have full control over and visibility into what’s happening with their critical data, preventing AI-induced mistakes. Watch this fun demo with the AI product team to show Copilot Editor's true value to the average Cribl user!

View Video

Cribl

Read more about The One Where We Show You Copilot Editor

IPL: How to create lists with ipl-web

Jun 6, 2025 By Sukhwinder Dhillon In Icinga

In my previous blog post, I explained how to build lists using ipl-web widgets. That method will soon be deprecated due to its complexity. With the recent ipl-web release, we have introduced a simpler and more flexible approach to building lists, using a lightweight rendering interface and a single class, as described below.

Read Post

Icinga

Read more about IPL: How to create lists with ipl-web

Fluentd vs Logstash: In-Depth Comparison of Two Popular Log Collectors 2025

Jun 6, 2025 By Pavithra Parthiban In Atatus

In modern observability stacks, log collection is a critical component. Among the most widely adopted log collectors are Fluentd and Logstash. Both tools are designed to collect, process, and forward logs to various destinations like Elasticsearch, Kafka, and cloud services. However, Fluentd and Logstash differ significantly in their design, performance, plugin ecosystems, and user experiences.

Read Post

Atatus

Read more about Fluentd vs Logstash: In-Depth Comparison of Two Popular Log Collectors 2025

MCP = Observability + Code, a Real-life Example

Jun 6, 2025 By Honeycomb In Honeycomb

Our bot is hitting an error. We can see it in the distributed trace. Here, see what happened when we noticed it: Austin fired up Claude Code (hooked up to Honeycomb with its MCP tool) and got it to find the error, fix it, deploy, and check that the fix worked. It got a little overconfident at first, but the ending is happy. IRL this took 22 minutes; the video speeds up the AI agent interactions and cuts out waiting. This video includes Austin Parker, Jessica Kerr, and Ken Rimple.

View Video

Honeycomb

Read more about MCP = Observability + Code, a Real-life Example

vmagent: Key Features Explained in Under 15 Minutes

Jun 6, 2025 By Phuong Le In VictoriaMetrics

This discussion is a part of the Basic Series serving as the starting point to quickly get you started with VictoriaMetrics. vmagent is a lightweight metrics collection agent that acts as a bridge between your applications and monitoring storage systems like VictoriaMetrics.

Read Post

VictoriaMetrics

Read more about vmagent: Key Features Explained in Under 15 Minutes

DX Operational Observability: Five New, Powerful Capabilities

Jun 6, 2025 By Pramit Saxena In Broadcom

DX Operational Observability (DX O2), our next-gen AIOps and Observability product, continues to provide new features and enhancements for practitioners across IT. DX O2 delivers a host of enhancements designed to empower IT operations, DevOps, and SRE teams. In this post, I introduce five powerful enhancements, outline steps to get started, and describe some of the benefits, which include deeper insights, improved efficiencies, and a more unified observability experience. Here are the five enhancements.

Read Post

Broadcom

Read more about DX Operational Observability: Five New, Powerful Capabilities

Create rich, up-to-date visualizations of your AWS infrastructure with Cloudcraft in Datadog

Jun 6, 2025 By Jace Harker In Datadog

As your cloud environment grows more complex and dynamic, it becomes more difficult to maintain up-to-date reference diagrams, visualizing its components, that are available to all teams. As a result, teams often end up lacking the visibility they need to understand, manage, and troubleshoot their cloud infrastructure and applications.

Read Post

Datadog

Read more about Create rich, up-to-date visualizations of your AWS infrastructure with Cloudcraft in Datadog

Database observability: How OpenTelemetry semantic conventions improve consistency across signals

Jun 6, 2025 By Marylia Gutierrez In Grafana

Databases are a crucial part of modern systems, which means database observability is incredibly important, too. However, gathering information on them can be complex, variable, and tricky to instrument in a consistent way. OpenTelemetry is helping to change that, and one of the most important aspects in making it work is a set of shared rules called semantic conventions.

Read Post

Grafana

Read more about Database observability: How OpenTelemetry semantic conventions improve consistency across signals

Top Features of Splunk Observability Cloud for Engineers

Jun 6, 2025 By Splunk In Splunk

In this video we’ll walk you through a demonstration of Splunk Observability Cloud’s key capabilities. You’ll see how you can monitor Kubernetes cluster health in Infrastructure Monitoring, and alert on your services’ health using AutoDetect Detectors and Alerts. We’ll then take a look at traces and metrics in APM, and use Related Content to find correlated log entries of error traces. Then we’ll use AlwaysOn Profiling to troubleshoot long duration traces for our service.

View Video

Splunk

Read more about Top Features of Splunk Observability Cloud for Engineers

Introducing Network Intelligence from Kentik

Jun 6, 2025 By Kentik In Kentik

Introducing the Kentik Network Intelligence Platform, a real time, holistic source of network truth built for modern infrastructure teams. Take the hard work out of running your network.

View Video

Kentik

Read more about Introducing Network Intelligence from Kentik

Introducing the Kentik Network Intelligence Platform

Jun 6, 2025 By Kentik In Kentik

Introducing the Kentik Network Intelligence Platform, a real time, holistic source of network truth built for modern infrastructure teams. Take the hard work out of running your network.

View Video

Kentik

Read more about Introducing the Kentik Network Intelligence Platform

Monitoring for Financial Services: Reducing Costs, Ensuring Reliability

Jun 6, 2025 By Sara Miteva In Checkly

Fintech has reshaped financial services, using technologies like machine learning and blockchain to deliver faster, smarter, more user-friendly experiences. Challenger banks, open banking apps, digital payments, and investment apps have set a new standard—leaving traditional institutions racing to keep up. But staying competitive isn’t just about building digital products—it’s about making them reliable.

Read Post

Checkly

Read more about Monitoring for Financial Services: Reducing Costs, Ensuring Reliability

Easy Method for Monitoring MinIO Performance Using Telegraf

Jun 6, 2025 By Benjamin Pitts In MetricFire

MinIO is a high-performance, S3-compatible object storage server built for cloud-native applications. It’s open-source, lightweight, and incredibly fast which makes it a solution for developers who need to store and serve unstructured data like images, logs, or backups. Whether you’re building a self-hosted alternative to Amazon S3 or running MinIO as part of a local development pipeline, it fits into modern containerized environments.

Read Post

MetricFire

Read more about Easy Method for Monitoring MinIO Performance Using Telegraf

What's Inside InfluxDB 3.1

Jun 6, 2025 By InfluxData In InfluxData

InfluxDB 3.1 is now available for both Core and Enterprise editions, bringing significant improvements that make managing high-volume, high-velocity time series data even easier, faster, and more secure. InfluxDB 3 Core is the free, open source edition of InfluxDB 3—a high-speed, recent-data engine licensed under MIT and Apache 2. InfluxDB 3 Enterprise is the commercial version of Core, adding support for longer-term historical queries, high availability, enhanced security, and more.

View Video

InfluxData

Read more about What's Inside InfluxDB 3.1

Where creativity and code collide: Top Down with Adhish Thite

Jun 6, 2025 By Jeanetta Clement In Elastic

Inside an AI engineer’s creative workspace.

Read Post

Elastic

Read more about Where creativity and code collide: Top Down with Adhish Thite

The Brain Behind the Pings: Understanding the Pingmesh Control Plane

Jun 6, 2025 By Rudra Rugge In Selector

In today’s interconnected world, a fundamental question plagues every network administrator and SRE: “Is my network running well?” The answer, often elusive, is precisely what Pingmesh aims to provide. By deploying a vast fleet of specialized probe agents, Pingmesh continuously monitors critical network health metrics, including latency, packet loss, jitter, and custom reachability checks, providing an unparalleled view into your network’s performance.

Read Post

Selector

Read more about The Brain Behind the Pings: Understanding the Pingmesh Control Plane

Getting Started with InfluxDB 3 Enterprise

Jun 6, 2025 By InfluxData In InfluxData

InfluxDB 3 Enterprise is built on a cloud-native, diskless architecture that removes the limits of traditional storage. It’s easy to deploy, scales effortlessly, and cuts out the complexity of managing clusters. Its stateless design keeps operations simple and adapts to any environment.

View Video

InfluxData

Read more about Getting Started with InfluxDB 3 Enterprise

Monitoring ECS Metrics: A Guide for Developers and Operations Teams

Jun 6, 2025 By Krish Chandra In eG Innovations

For anyone leveraging cloud computing, Amazon Elastic Container Service (ECS) continues to provide a seamless solution for managing containerized applications. AWS Fargate takes this cloud-native architecture a step further by allowing you to run containers without servers or clusters. As a serverless offering for ECS, Fargate provisions compute capacity and scales it based on demand.

Read Post

eG Innovations

Read more about Monitoring ECS Metrics: A Guide for Developers and Operations Teams

From Downtime to Uptime: Monitoring Tools and Techniques for Systems, Websites, APIs, and More

Jun 6, 2025 By Joseph Nduhiu In Splunk

Recently, while visiting a friend in a local hospital, I found myself facing a frustrating distraction: trying to pay parking fees using USSD (a mobile text-based system for quick transactions). The service was either painfully slow or not working at all. I wasn’t alone. Other visitors were just as exasperated, and parking attendants stood idle, their handheld devices frozen in endless loading loops.

Read Post

Splunk

Read more about From Downtime to Uptime: Monitoring Tools and Techniques for Systems, Websites, APIs, and More

AI + Dark Mode: Introducing AI-Powered Insights and The Long Awaited Dark Mode

Jun 6, 2025 By Adnan Rahic In ObservIQ

Join the live stream at 11 am ET, here. Launch Week’s Friday drop delivers two of the most-requested upgrades we’ve ever shipped. Together, they turn Bindplane into a cooler and smarter place to manage observability and SecOps telemetry. A full suite of extensive AI features will be rolling out over the coming weeks. This is just the beginning!

Read Post

ObservIQ

Read more about AI + Dark Mode: Introducing AI-Powered Insights and The Long Awaited Dark Mode

Digitate Advances Agentic AI Platform to Accelerate Enterprises Towards Autonomous, Ticketless IT and Business Operations

Jun 5, 2025 By Digitate In Digitate

New release launches purpose-built AI agents for IT and SREs as well as AI Assistants for CIOs to tackle complex enterprise tasks.

Read Post

Digitate

Read more about Digitate Advances Agentic AI Platform to Accelerate Enterprises Towards Autonomous, Ticketless IT and Business Operations

Don't rewrite your login for every Playwright Test!

Jun 5, 2025 By Checkly In Checkly

Find yourself writing out the login steps in every single Playwright test? Instead of going through the login sequence each time, we can take a snapshot of the browser state, cookies and all, and log in just once.

View Video

Checkly

Read more about Don't rewrite your login for every Playwright Test!

Why did my Playwright test fail? Analyzing Traces #playwright #checkly

Jun 5, 2025 By Checkly In Checkly

We all check the video first, let's go further to find out why our Playwright test failed.

View Video

Checkly

Read more about Why did my Playwright test fail? Analyzing Traces #playwright #checkly

3 Reasons Why You Should Use Custom Playwright Fixtures

Jun 5, 2025 By Checkly In Checkly

In this video, Stefan Judis, Playwright ambassador, explains the power of Playwright fixtures while running tests in JavaScript or TypeScript. Learn how to streamline your test setup, remove repeated code, and leverage custom fixtures for cleaner and more efficient end-to-end tests. By the end of this video, you'll have a clear understanding of why you should use Playwright's native architecture to structure your testing project.

View Video

Checkly

Read more about 3 Reasons Why You Should Use Custom Playwright Fixtures

Working with GPUs on Kubernetes and making them observable

Jun 5, 2025 By Nikolay Sivko In Coroot

GPUs are everywhere, powering LLM inference, model training, video processing, and more. Kubernetes is often where these workloads run. But using GPUs in Kubernetes isn’t as simple as using CPUs. You need the right setup. You need efficient scheduling. And most importantly, you need visibility. This post walks through how to run GPU workloads on Kubernetes, how to virtualize them efficiently, and how Coroot helps you monitor everything with zero instrumentation or config.

Read Post

Coroot

Read more about Working with GPUs on Kubernetes and making them observable

Why You Need Real User Monitoring to Really Understand Your Web Performance

Jun 5, 2025 By Todd H. Gardner In Request Metrics

Great Lighthouse scores, but your site is still slow. Sound familiar? You’ve run PageSpeed Insights, Request Metrics, and every other synthetic test you can find. Your scores look great. But your analytics shows users bouncing, conversions dropping, and complaints about “slow pages.” What’s going on? The answer is simple: synthetic testing only tells you how your site performs in a test, not how it performs for real users in the real world.

Read Post

Request Metrics

Read more about Why You Need Real User Monitoring to Really Understand Your Web Performance

Why Cribl Copilot Editor is Built for the Human, First and Foremost

Jun 5, 2025 By Ledion Bitincka In Cribl

I’m genuinely excited about what we're rolling out with Copilot Editor, an update to our AI that’s truly packed with new capabilities designed to help you automate pipeline development. You can read about these capabilities here. I wanted to take a moment to share our thinking on a core principle that guides how we build, especially regarding the impactful, and sometimes daunting, world of generative AI.

Read Post

Cribl

Read more about Why Cribl Copilot Editor is Built for the Human, First and Foremost

The Cribl Copilot Difference - Explainabl and Steerabl AI Built for Humans

Jun 5, 2025 By Nikhil Mungel In Cribl

When you run critical IT or Security systems, you are still on the hook if an alert is missed or a pipeline drops data. That is why Cribl Copilot never tries to replace you. Instead, it works as a partner you can question, guide, and overrule.

Read Post

Cribl

Read more about The Cribl Copilot Difference - Explainabl and Steerabl AI Built for Humans

Blueprints: Ready-Made Processor Bundles For Your Telemetry Pipelines

Jun 5, 2025 By Adnan Rahic In ObservIQ

We’ve noticed that a lot of our customers spend countless hours building and configuring processors. Parsing JSON, standardizing log formats, normalizing timestamps, masking PII, de-duplicating logs: the list never ends. Most of the work revolves around recreating the same processor bundles in multiple processor nodes. Bindplane’s new Blueprints feature eliminates that boring, repetitive work by providing pre-built processor bundles you can drop into any pipeline with a single click.

Read Post

ObservIQ

Read more about Blueprints: Ready-Made Processor Bundles For Your Telemetry Pipelines

How to Configure and Optimize Prometheus Data Retention

Jun 5, 2025 By Preeti Dewani In Last9

Prometheus can be lightweight to start with, but once it’s in production, storage usage tends to grow faster than expected. Managing how long data is kept becomes critical, especially when you're working with limited disk space or tight budgets. This guide outlines the key concepts behind Prometheus data retention, how to configure it effectively, and what to watch out for.

Read Post

Last9

Read more about How to Configure and Optimize Prometheus Data Retention

Shift-Left Monitoring for GitHub and Vercel Workflows

Jun 5, 2025 By Sematext In Sematext

A recent LinkedIn poll by Peter Zaitsev asked: “What is the most common preventable cause of downtime in your environment?” Guess what most respondents said it was? Surprise, surprise – the top answer is Deploying Broken Code, with 57% of respondents selecting it. This reinforces how critical it is to catch issues before they hit production.

Read Post

Sematext

Read more about Shift-Left Monitoring for GitHub and Vercel Workflows

How to Monitor Frontend Memory Usage

Jun 5, 2025 By Sematext In Sematext

First of all, by frontend memory usage I mean the amount of memory that a user’s browser needs when using your website or webapp. Secondly, do you have any idea how much browser memory your website or webapp requires? Or do you know if or how much the memory footprint of your website/webapp has changed over the last few months? Or after the recent changes or releases you made? I’m guessing you don’t. Yet, this is important to monitor to avoid a bad user experience.

Read Post

Sematext

Read more about How to Monitor Frontend Memory Usage

Announcing Go tracer v2.0.0

Jun 5, 2025 By Dario Castañé In Datadog

Datadog has long supported the monitoring of instrumented Go applications through our Go tracer v1. As the Go ecosystem has continued to mature, we’ve been hard at work collecting feedback and improving upon the tracer’s capabilities and usability features. We are now thrilled to announce the release of our Go tracer v2.0.0. This major update includes better security and stability, and a new and simplified API.

Read Post

Datadog

Read more about Announcing Go tracer v2.0.0

Beyond Shift Left: Engineering Leaders Increase Speed and Resilience With Observability

Jun 5, 2025 By Colin Burke In Honeycomb

We recently had the privilege of hosting several industry experts and technology executives across platform strategy, SRE, and engineering enablement for breakfast at our Observability Day in London. We noted that they’re all facing the same fundamental tension: deliver faster, scale smarter, stay resilient, and somehow get ahead of what’s coming next. But how do you move fast without breaking things? And how do you prove the value of the things you don’t break?

Read Post

Honeycomb

Read more about Beyond Shift Left: Engineering Leaders Increase Speed and Resilience With Observability

Catchpoint News Catchup | Episode 1

Jun 5, 2025 By Catchpoint In Catchpoint

Join Sergey, Kelly, Payal, and Leon on our inaugural episode where we talk about recent news from SC Media, LinkedIn, TechCrunch, and CyberSecurity News.

View Video

Catchpoint

Monitoring

Read more about Catchpoint News Catchup | Episode 1

Solve your MTTR mysteries faster with Sumo Logic

Jun 5, 2025 By Merylee Heggem In Sumo Logic

Picture this: a crime scene where the evidence is scattered across five different rooms. There’s a footprint in one, a shattered window in another, a stray shoe on the stairs, and a witness across the street, who only saw part of what happened. Each clue matters in solving the case, but none of them tells the full story on their own.

Read Post

Sumo Logic

Read more about Solve your MTTR mysteries faster with Sumo Logic

Sponsored Post

Smarter alerts using P75 for more signal and less noise

Jun 4, 2025 By Sumitra Manga In Raygun

We've rolled out a new feature in Raygun Alerting that gives you more control over how you track and respond to performance regressions. Starting today, you can use the 75th percentile (P75) as a filter option for page performance data in Real User Monitoring, such as Core Web Vitals and page load time, right alongside the default 'Average'. This option is available under the "Page/XHR performance change" condition and supports all the Web Vitals metrics we track. Let's break down why this matters, when you should use P75, and how it gives you better, faster insights into how real users are experiencing your site or app.

Read Post

Raygun

Read more about Smarter alerts using P75 for more signal and less noise

Icinga 2 DSL - Variable Scopes

Jun 4, 2025 By Yonas Habteab In Icinga

Ever wondered how Icinga 2 manages all those variables, and how it knows which one to use? In this blog post, we will explore all the different variable scopes in Icinga 2, and by the end, you will know what this mysterious error message means when you see it in your logs.

Read Post

Icinga

Read more about Icinga 2 DSL - Variable Scopes

Introducing Cribl Copilot Editor

Jun 4, 2025 By Cribl In Cribl

Start building complete telemetry pipelines with Cribl Copilot Editor to tackle your toughest filter, transformation and schema mapping challenges today.

View Video

Cribl

Read more about Introducing Cribl Copilot Editor

Integrations: AWS & Logs

Jun 4, 2025 By Honeycomb In Honeycomb

Check out just a few ways to bring data into Honeycomb. This video highlights one-click AWS integrations, plus familiar log formats. Honeycomb Telemetry Pipeline Manager is an optional component of Honeycomb Telemetry Pipeline, a product available with Honeycomb Enterprise.

View Video

Honeycomb

Read more about Integrations: AWS & Logs

Scaling Observability: How We Designed Bindplane to Manage 1,000,000 OpenTelemetry Collectors

Jun 4, 2025 By Adnan Rahic In ObservIQ

Join the live stream at 11 am ET, here. Platform teams tend to start with just one or, in some cases, a handful of OpenTelemetry (OTel) Collectors, usually running in gateway mode. They then embrace the benefits of a vendor-neutral, standardized telemetry collector for unified logs, metrics, and traces.

Read Post

ObservIQ

Read more about Scaling Observability: How We Designed Bindplane to Manage 1,000,000 OpenTelemetry Collectors

Why Does Your Network Get Blamed When Trouble Lies Beyond the Firewall?

Jun 4, 2025 By Yann Guernion In Broadcom

The familiar scene unfolds: Critical applications are sluggish, user complaints are mounting, and the IT war room is buzzing. Eyes quickly dart towards the network team. It’s an almost instinctual reaction. But what happens when the problem isn't within the corporate LAN or even the data center? What if the real culprit lurks somewhere in the vast, untamed wilderness of the internet, a cloud provider's backbone, or a third-party SaaS application’s infrastructure?

Read Post

Broadcom

Read more about Why Does Your Network Get Blamed When Trouble Lies Beyond the Firewall?

The 3 smart updates to our Jira plugin

Jun 4, 2025 By Noorul Huda N In Squared Up

The Jira plugin is one of our most-used integrations and for good reason. Teams rely on it daily to stay on top of work, manage issues, and ship on time. As more people leaned on it, we saw a chance to make the experience even smoother. So, we gave it an upgrade. We’ve refreshed the out-of-the-box dashboards, simplified the data streams, and improved the overall experience. So, let’s take a closer look at what’s changed.

Read Post

Squared Up

Read more about The 3 smart updates to our Jira plugin

Splunk on SGTech - Tech Transforms Life

Jun 4, 2025 By Splunk In Splunk

With the explosion of data across endless environments, devices and applications, organisations and government agencies are faced with the pressing challenge of getting their data house in order to achieve efficiency, transparency, security and governance. Learn how Splunk helps businesses like Singapore Airlines, LG Electronics and DANA fintech group transform complex data into valuable business outcomes and strengthen digital resilience.

View Video

Splunk

Read more about Splunk on SGTech - Tech Transforms Life

How to Improve Uptime and Achieve Root Cause Analysis (with Open Source!)

Jun 4, 2025 By Coroot In Coroot

Observability doesn’t begin and end at telemetry or your ELK stack: most open source or vendor tools require configuration, dashboard customization, and may not actually pinpoint the data you need to mitigate system risks. Coroot was designed to solve the problem of time-consuming root cause analysis: it handles the full observability journey — from collecting telemetry to turning it into actionable insights. We also strongly believe that simple observability should be an innovation everyone can afford to benefit from: which is why our software is open source.

View Video

Coroot

Read more about How to Improve Uptime and Achieve Root Cause Analysis (with Open Source!)

A Developer's Framework for Selecting the Right Tracing Vendor

Jun 4, 2025 By Alexandr Bandurchin In Uptrace

Distributed tracing tracks requests as they flow through microservices, revealing bottlenecks, failures, and performance patterns. Without proper tracing, debugging production issues becomes guesswork—especially in complex architectures with dozens of services. Modern applications generate millions of traces daily. The right vendor helps you extract actionable insights without drowning in data or breaking your budget.

Read Post

Uptrace

Read more about A Developer's Framework for Selecting the Right Tracing Vendor

Why Datadog Falls Short for Log Management and What to Do Instead

Jun 4, 2025 By Mezmo In Mezmo

Datadog may be the default choice for all-in-one observability, but its logging experience takes a back seat to the broader platform. Logs are primarily designed to feed into metrics and traces, which leads to tradeoffs such as slower search, complex workflows, and a UI that isn’t optimized for log investigations. As a result, Datadog doesn’t align with how developers actually troubleshoot.

Read Post

Mezmo

Read more about Why Datadog Falls Short for Log Management and What to Do Instead

How to Log Into a Docker Container

Jun 4, 2025 By Anjali Udasi In Last9

When your Docker container isn't behaving the way you expect, you need to get inside and see what's going on. Maybe your app is throwing errors, a service won't start, or you just need to check some configuration files. Getting into a running Docker container is simpler than you might think, but there are several ways to do it depending on your situation. This guide shows you exactly how to log into Docker containers, troubleshoot common issues, and debug your applications effectively.

Read Post

Last9

Read more about How to Log Into a Docker Container

Map, Transform, Filter: How Copilot Editor Helps Teams (and Their Pipelines) Have It All

Jun 4, 2025 By Desi Gavis-Hughson In Cribl

Ever spent a week wrangling log pipelines just to get your SIEM to stop screaming about missing fields? Wasted way too much time stripping out noisy events and reformatting data for analytics? You’re not the only one. If you work in Security or ITOps, you know the pain: every new data source means another round of schema headaches, more manual mapping, endless field transformations, and a quick prayer that you didn’t break something critical (or let in a flood of junk events).

Read Post

Cribl

Read more about Map, Transform, Filter: How Copilot Editor Helps Teams (and Their Pipelines) Have It All

Building a "Dark Mode" for the UI #coding #programming #webdesign #debugging

Jun 4, 2025 By TrackJS In TrackJS

Eric from @Trackjs talks about the most-requested feature from the redesign project: Dark Mode.

View Video

TrackJS

Read more about Building a "Dark Mode" for the UI #coding #programming #webdesign #debugging

Elastic achieves AWS Education ISV Partner Competency, strengthening education solutions portfolio

Jun 4, 2025 By Udayasimha Theepireddy (Uday), In Elastic

Advancing digital transformation in education through Search AI and cloud innovation. We’re thrilled to share that Elastic has achieved the AWS Education ISV Partner Competency. This prestigious designation recognizes Elastic as an Amazon Web Services (AWS) partner that has proven expertise in delivering high-quality solutions that help education institutions support successful student outcomes while protecting security and privacy.

Read Post

Elastic

Read more about Elastic achieves AWS Education ISV Partner Competency, strengthening education solutions portfolio

Upgrade Readiness: Unlocking Success with the Splunk Health Assistant Add-On

Jun 4, 2025 By Karan Kukkar In Splunk

Splunk recently announced exciting updates and significant modernizations for the upcoming releases of Splunk Enterprise and Splunk Cloud Platform. This blog is the first in a series to help prepare your organization for these changes by exploring upgrade readiness best practices. This first installment will highlight the Splunk Health Assistant Add-On, a vital tool that supplements the Splunk Enterprise Monitoring Console, designed to streamline your transition to the next version of Splunk Enterprise.

Read Post

Splunk

Read more about Upgrade Readiness: Unlocking Success with the Splunk Health Assistant Add-On

5 Tips for Managing Client Sites With Oh Dear

Jun 4, 2025 By Sean White In Oh Dear

Managing dozens (or hundreds) of client sites can quickly become chaotic without the right tools. Whether you're running an agency, internal platform team or dev shop, visibility and control are everything. That's where Oh Dear comes in. Oh Dear is an all-in-one monitoring service that gives you a unified dashboard for uptime checks, performance monitoring, broken link detection, SSL and domain expiry alerts, scheduled task validation and more.

Read Post

Oh Dear

Read more about 5 Tips for Managing Client Sites With Oh Dear

Full-Stack Performance and Debugging Workshop

Jun 4, 2025 By Sentry In Sentry

We're going hands-on with all things application development, and making sure that when your code breaks, you have a good way to figure out how to fix it. Sentry Build workshops stay in the code, building software, and debugging real problems with the latest tools from Sentry.

View Video

Sentry

Read more about Full-Stack Performance and Debugging Workshop

Scale | Bindplane Launch Week 1 Day 3

Jun 4, 2025 By Bindplane In ObservIQ

Think “order-of-magnitude scale,” not incremental tuning. Your telemetry pipelines are about to feel limitless.

View Video

ObservIQ

Read more about Scale | Bindplane Launch Week 1 Day 3

SentinelOne Outage: Why Early Detection and Independent Monitoring Matter

Jun 4, 2025 By Colin Bartlett In StatusGator

When SentinelOne, a leader in cybersecurity and endpoint protection, experienced a major outage last week, thousands of organizations were suddenly left in the dark. With SentinelOne down for hours, IT and security teams scrambled for information and updates. But there was a critical missing piece: SentinelOne has no public status page. This gap left customers frustrated, searching for answers on social media, Reddit, and unofficial channels.

Read Post

StatusGator

Read more about SentinelOne Outage: Why Early Detection and Independent Monitoring Matter

Real-Time Observability with ClickHouse, Coroot, and GlassFlow

Jun 4, 2025 By Pablo Pardo Garcia In Coroot

Coroot is excited to feature an editorial from GlassFlow for our first Open Source Spotlight. We hope to improve the workflow of our global community of SREs and DevOps professionals by sharing exciting projects like GlassFlow, which make innovation accessible for everyone through the freedom of open source. If you have an open source or open core project you’d like to see on our blog next, send us a message!

Read Post

Coroot

Read more about Real-Time Observability with ClickHouse, Coroot, and GlassFlow

Comparison of the Best and Most Popular NoSQL Databases

Jun 4, 2025 By DNSstuff tech team In SolarWinds

Traditional databases store data in structured tables, whereas NoSQL (non-SQL) databases use more flexible, non-tabular storage methods. NoSQL databases can store a wider range of data types, including document stores, wide columns, key-value stores, and graphs. These databases first emerged in the late 2000s to support massive horizontal scaling and high-throughput workloads for web applications.

Read Post

SolarWinds

Read more about Comparison of the Best and Most Popular NoSQL Databases

Inside the Wins: Real Stories of Transforming Azure Observability into Business Value

Jun 4, 2025 By LogicMonitor In LogicMonitor

Azure environments are growing fast, and so are the challenges of monitoring them at scale. In this blog, part of our Azure Monitoring series, we look at how real ITOps and CloudOps teams are moving beyond Azure Monitor to achieve hybrid visibility, faster troubleshooting, and better business outcomes. These real-life customer stories show what’s possible when observability becomes operational. Want the full picture? Explore the rest of the series.

Read Post

LogicMonitor

Read more about Inside the Wins: Real Stories of Transforming Azure Observability into Business Value

Best practices for end-to-end custom metrics governance

Jun 3, 2025 By Colten Woo In Datadog

Custom metrics enable you to track what matters to your distinct business and services and correlate it with the rest of your telemetry data. As your organization grows by adding more teams, services, and environments, your volume of custom metrics can grow with it. To ensure critical visibility while maintaining cost efficiency, organizations need an end-to-end approach to custom metrics governance.

Read Post

Datadog

Read more about Best practices for end-to-end custom metrics governance

Java License Monitoring - Why you need to monitor your Java licenses and how to do so

Jun 3, 2025 By Babu Sundaram In eG Innovations

Java license monitoring has now become an essential requirement for many organizations as Oracle’s recent licensing changes have made compliance mandatory, with increased risks of audits and higher Java licensing compliance costs. Once a free programming platform, Java now requires navigating a complex licensing framework, including employee-based models that tie costs to the size of a workforce. These changes significantly increase the risk of unbudgeted expenses for licensing violations.

Read Post

eG Innovations

Read more about Java License Monitoring - Why you need to monitor your Java licenses and how to do so

Gamers Have Real Skills-Especially in Tech!

Jun 3, 2025 By solarwindsinc In SolarWinds

Gaming, especially MMOs, builds real social skills—and in tech, that actually matters. Let's talk about how the "guy in a basement" stereotype misses the mark and why gaming deserves the same respect as any other hobby.

View Video

SolarWinds

Read more about Gamers Have Real Skills-Especially in Tech!

Monitor OpenTelemetry-native metrics with Datadog

Jun 3, 2025 By Shanel Huang In Datadog

OpenTelemetry (OTel) is emerging as the industry standard for collecting and transmitting observability data. Datadog supports several ways to send and accept OTel-native data, while also continuing to support its own native telemetry format. To provide a consistent monitoring experience, Datadog now supports using OTel-native metrics alongside Datadog-native metrics across dashboards, queries, and core visualizations in the Datadog platform.

Read Post

Datadog

Read more about Monitor OpenTelemetry-native metrics with Datadog

Operational Resilience in 2025: Meeting New Standards, Mitigating New Risks

Jun 3, 2025 By ScienceLogic In ScienceLogic

In a world of constant disruption, operational resilience is now mission critical. From cyberattacks and misconfigurations to vendor outages and natural disasters, today’s enterprises are navigating risks that move faster and hit harder than ever before. As we enter 2025, operational resilience has evolved from a best practice to a board-level imperative.

Read Post

ScienceLogic

Read more about Operational Resilience in 2025: Meeting New Standards, Mitigating New Risks

IETF Decreased Mean Response Time by 90% with Scout APM!

Jun 3, 2025 By Sarah Morgan In Scout

The Internet Engineering Task Force (IETF) is the premier Internet standards body, developing open standards through open processes. The IETF is a large open international community of network designers, operators, vendors, and researchers concerned with the evolution of Internet architecture and the smooth operation of the Internet. The IETF standards-setting process is open to any individual interested in providing technical contributions.

Read Post

Scout

Read more about IETF Decreased Mean Response Time by 90% with Scout APM!

Bindplane Launch Week 1 [June 2-6] - Day 2 - Custom OTel Collectors

Jun 3, 2025 By Bindplane In ObservIQ

The point of OpenTelemetry has been to give you a choice. Yet, most observability vendors still insist you run their collector. We’re removing that last point of friction. With Bring Your Own Collector (BYOC), Bindplane now accepts any upstream-compatible build, recognizes exactly which receivers, processors, and exporters it contains, and adapts the UI and configuration workflow on the fly. No forks, no vendor stamp—just the collector you already trust, fully managed by Bindplane.

View Video

ObservIQ

Read more about Bindplane Launch Week 1 [June 2-6] - Day 2 - Custom OTel Collectors

From Archive to Insight: Ingest Splunk DDSS Data Directly into Cribl Lake

Jun 3, 2025 By Rick Salsa and In Cribl

If you’re a Splunk user relying on Dynamic Data Self Storage (DDSS) or Active Archive (DDAA), you’re likely familiar with this frustrating loop.

Read Post

Cribl

Read more about From Archive to Insight: Ingest Splunk DDSS Data Directly into Cribl Lake

Introducing our improved uptime check

Jun 3, 2025 By Freek Van der Herten In Oh Dear

Over the past few months, we’ve been working on improving our uptime check. We’re proud to announce that this improved check is now available for all users. You don’t have to do anything to get it (unless you are not subscribed to Oh Dear, in which case you should subscribe to Oh Dear); all our users now have it enabled by default. In this blog post, I’d like to give an overview of the changes and some background on why we changed some things.

Read Post

Oh Dear

Read more about Introducing our improved uptime check

Telemetry for Modern Apps

Jun 3, 2025 By Checkly In Checkly

This video features a joint session hosted by the Checkly team in collaboration with Mezmo. We discuss current practices and challenges in web application monitoring. The session explores practical strategies, shared experiences from both teams, and insights into how modern engineering teams are approaching observability and reliability at scale.

View Video

Checkly

Read more about Telemetry for Modern Apps

Optimizing the end-user experience: How to perform a browser check in Grafana Cloud Synthetic Monitoring

Jun 3, 2025 By Bukola Ayodele In Grafana

Synthetic monitoring is a vital practice to proactively track the health and performance of web applications. Instead of waiting for users to report problems, synthetic monitoring helps developers catch issues before they impact real users. One powerful type of synthetic monitoring is the browser check. These checks go beyond basic ping checks, simulating how a user would actually interact with your website’s interface.

Read Post

Grafana

Read more about Optimizing the end-user experience: How to perform a browser check in Grafana Cloud Synthetic Monitoring

How to send alerts from Grafana OSS to Grafana Cloud IRM

Jun 3, 2025 By Pepe Cano In Grafana

In March, we announced that Grafana OnCall (OSS) had entered maintenance mode. However, OnCall’s development continues in Grafana Cloud as Grafana Cloud IRM, combining on-call management and incident response into one integrated solution. Many users told us they still want to self-host Grafana and rely on Grafana Alerting to detect potential issues early—but they also need to escalate and manage incidents using an incident response management (IRM) solution.

Read Post

Grafana

Read more about How to send alerts from Grafana OSS to Grafana Cloud IRM

How to send alerts from self-hosted Grafana to Grafana Cloud IRM

Jun 3, 2025 By Grafana In Grafana

Learn how to send alerts from Grafana OSS or Grafana Enterprise to Grafana Cloud IRM. In this quick demo, we'll show you how to set up the integration between your self-hosted instance and our managed solution for consolidating, customizing, and automating incident response and management. Grafana Cloud is the easiest way to get started with Grafana dashboards, metrics, logs, and traces. Our forever-free tier includes access to 10k metrics, 50GB logs, 50GB traces and more.

View Video

Grafana

Read more about How to send alerts from self-hosted Grafana to Grafana Cloud IRM

k8s-monitoring Helm Chart Office Hours - 2025-05-27

Jun 3, 2025 By Grafana In Grafana

In the May edition of the Kubernetes Monitoring Helm chart office hours, we discuss the recent version 2.1 release as well as changes to how the pod label is handled. We also discuss some exciting upcoming features. Finally, we end with a very short Q&A.

View Video

Grafana

Read more about k8s-monitoring Helm Chart Office Hours - 2025-05-27

Edge Data Replication: Contributions and Status Updates for InfluxDB 3

Jun 3, 2025 By Anais Dotis-Georgiou In InfluxData

If you’ve ever stood up multiple edge InfluxDB instances in remote locations and wished you could consolidate their data into a centralized instance for analysis, you’re not alone. That’s exactly why we designed Edge Data Replication (EDR) in InfluxDB v2. Now, with InfluxDB 3 Core and 3 Enterprise, we’re seeing new ways to handle replication using the brand-new Python Processing Engine.

Read Post

InfluxData

Read more about Edge Data Replication: Contributions and Status Updates for InfluxDB 3

Your Collector, Your Rules: Introducing BYOC and the OpenTelemetry Distribution Builder

Jun 3, 2025 By Adnan Rahic In ObservIQ

Join the live stream at 11 am ET, here. OpenTelemetry’s superpower has always been choice. Yet, most observability vendors still insist you run their collector. Today we’re removing that last point of friction. With Bring Your Own Collector (BYOC), Bindplane now accepts any upstream-compatible build, recognizes exactly which receivers, processors, and exporters it contains, and adapts the UI and configuration workflow on the fly.

Read Post

ObservIQ

Read more about Your Collector, Your Rules: Introducing BYOC and the OpenTelemetry Distribution Builder

Custom Collectors | Bindplane Launch Week 1 Day 2

Jun 3, 2025 By Bindplane In ObservIQ

The point of OpenTelemetry has been to give you a choice. Yet, most observability vendors still insist you run their collector. We’re removing that last point of friction.

View Video

ObservIQ

Read more about Custom Collectors | Bindplane Launch Week 1 Day 2

How to Set Up Tracing for Elixir Apps Using AppSignal

Jun 3, 2025 By Aestimo Kirina In AppSignal

Over time, web applications have evolved from simple request/response-based systems into complex, distributed ones with lots of moving parts. If something goes wrong (and you can be sure it will), finding the cause can be nearly impossible. But this need not be the case: enter tracing. Tracing refers to the process of collecting detailed information about the execution of requests within an application, including function calls, execution time, and other relevant data.

Read Post

AppSignal

Read more about How to Set Up Tracing for Elixir Apps Using AppSignal

Agentic AI: Powerful But Fragile-What You Need to Know

Jun 3, 2025 By Howard Beader In Catchpoint

Just when you’d finally wrapped your head around AI, here comes its autonomous cousin, Agentic AI. Think of it as AI that doesn’t just assist, but acts. It makes decisions, handles tasks, and communicates with other systems on its own. While it’s revolutionizing supply chains and customer experiences, there’s a catch. These autonomous agents rely on a plethora of third-party services, and when one fails, everything stops.

Read Post

Catchpoint

Read more about Agentic AI: Powerful But Fragile-What You Need to Know

Identifying Idle Paths in a Data Center Leaf-Spine Fabric

Jun 3, 2025 By Phil Gervasi In Kentik

In leaf-spine data center networks, traffic often becomes imbalanced, leaving some uplinks idle and resulting in wasted bandwidth. Kentik helps engineers identify underutilized paths, diagnose the causes, and take corrective action using enriched telemetry, visual topology maps, and intelligent alerts, turning hidden inefficiencies into actionable insights.

Read Post

Kentik

Read more about Identifying Idle Paths in a Data Center Leaf-Spine Fabric

How to Fix Latency Spikes in WAN and LAN Networks

Jun 3, 2025 By Andrii Kernitskyi In Obkio

Even a few seconds of delay in your network can be the difference between closing a deal on a video call, or watching it buffer into oblivion. These delays, known as latency spikes, are unpredictable surges in the time it takes for data to travel across your network. Whether you're running a cloud-based CRM, managing VoIP calls across offices, or supporting remote teams on Microsoft Teams or Zoom, latency spikes can disrupt productivity, hinder performance, and lead to a flood of support tickets.

Read Post

Obkio

Read more about How to Fix Latency Spikes in WAN and LAN Networks

Peacetime Observability: Spotting Risks Before They Become Incidents

Jun 3, 2025 By Nikolay Sivko In Coroot

Most of the time, nothing’s broken. Traffic’s flowing, alerts are quiet, and everything seems fine. That’s peacetime, when no one’s getting paged. Coroot helps in both peacetime and wartime. When things go wrong, it guides you to the root cause fast. But during peacetime, it helps you spot risks early, clean up inefficiencies, and prevent those incidents from happening in the first place.

Read Post

Coroot

Read more about Peacetime Observability: Spotting Risks Before They Become Incidents

Graylog vs ELK: Which Log Management Solution Fits Your Stack?

Jun 3, 2025 By Faiz Shaikh In Last9

Your app logs start simple—maybe a few print() or logging.info() calls. But in production, things get noisy. Thousands of log lines per minute, scattered across services, and it’s hard to know what matters. This is when tools like Graylog and the ELK stack help. They let you collect, search, and make sense of logs, but they do it in different ways. This guide breaks down how each one handles setup, scale, and day-to-day use.

Read Post

Last9

Read more about Graylog vs ELK: Which Log Management Solution Fits Your Stack?

Unlocking Real-Time Collaboration: Why Your Network Is the Key to Vibe Working

Jun 3, 2025 By Teneo In Teneo

Lately, there has been a growing buzz around the concept of “Vibe Working,” where teams are leveraging AI to dynamically share, develop, test, and transform “fuzzy” ideas into something useful in real-time. I view this approach as one of the next significant evolutions in our professional and technological landscape. Reflecting on my own journey in technology, I’ve observed how the pace of innovation and collaboration continually reshapes our daily workflows.

Read Post

Teneo

Read more about Unlocking Real-Time Collaboration: Why Your Network Is the Key to Vibe Working

How to Monitor and Manage Grafana Memory

Jun 3, 2025 By Anjali Udasi In Last9

It’s late, you get an alert, and Grafana is down. The reason? It ran out of memory. If you’ve ever watched Grafana slowly eat up RAM until it just stops responding, you know how frustrating that can be. Memory can spike quickly, especially with complex dashboards and multiple data sources. This guide will help you understand what’s going on and how to keep Grafana running without surprises.

Read Post

Last9

Read more about How to Monitor and Manage Grafana Memory

Top five metrics to monitor in IIS Logs

Jun 3, 2025 By David Girvin In Sumo Logic

When managing and troubleshooting IIS (Internet Information Services) web server performance, logs are a critical resource. They capture detailed information about every request and response so your team can detect issues quickly. Let’s walk through the main IIS log formats, explore a sample log file, and break down five key types of IIS metrics you should monitor.

Read Post

Sumo Logic

Read more about Top five metrics to monitor in IIS Logs

NiCE DB2 Management Pack 5.40

Jun 2, 2025 By NiCE IT Mgmt In NiCE IT Mgmt

NiCE is proud to announce the availability of the NiCE DB2 Management Pack 5.40, a new milestone in advanced monitoring and management for IBM DB2 environments. Version 5.40 introduces powerful enhancements that improve efficiency, compatibility, and ease of use. Cluster Synchronization Improvements: Ensures more accurate and efficient configuration sync across clustered deployments.

Read Post

NiCE IT Mgmt

Read more about NiCE DB2 Management Pack 5.40

Service Level Objectives -- Customer Brown Bag -- May 29th, 2025

Jun 2, 2025 By Sumo Logic, Inc. In Sumo Logic

This technical session on Service Level Objectives (SLOs) will cover the fundamentals of SLOs, SLIs, and SLAs, along with how to define, monitor, and optimize them for system reliability. Through hands-on demonstrations, you'll learn to set up SLOs in Sumo Logic, track performance using logs, metrics, and tracing, and configure proactive alerts for incident response. By the end, you’ll have the skills to implement and manage SLOs effectively, ensuring your services meet reliability goals while balancing performance and cost.

View Video

Sumo Logic

Read more about Service Level Objectives -- Customer Brown Bag -- May 29th, 2025

Telemetry Routing | Bindplane Launch Week 1 Day 1

Jun 2, 2025 By Bindplane In ObservIQ

Bindplane’s philosophy has always been rooted in enabling you to receive telemetry from any source, process it efficiently, and deliver it to any destination. You’ll see the launch of brand new integrations.

View Video

ObservIQ

Read more about Telemetry Routing | Bindplane Launch Week 1 Day 1

Introducing RUM without Limits: Capture everything, keep what matters

Jun 2, 2025 By Bridgitte Kwong In Datadog

Real User Monitoring (RUM) helps teams understand exactly how their users experience their web and mobile applications—from load times to crashes and frustration signals. But traditional RUM models come with tough trade-offs: capture all sessions and overspend, or sample data and miss what matters. Fixed sampling rates may help manage volume, but they leave dangerous blind spots.

Read Post

Datadog

Read more about Introducing RUM without Limits: Capture everything, keep what matters

Unify telemetry, own your pipeline: New integrations for Windows, Network Telemetry, and Cloud Storage

Jun 2, 2025 By Adnan Rahic In ObservIQ

Today, we're expanding on the integrations front, and launching new integrations for Windows events, network telemetry, and cloud storage. Here's a quick tour of what's new and why it matters.

Read Post

ObservIQ

Read more about Unify telemetry, own your pipeline: New integrations for Windows, Network Telemetry, and Cloud Storage

What Are The Top Website Monitoring Services in 2025?

Jun 2, 2025 By Meenz Nautiyal In WebSitePulse

Every business owner understands the importance of website monitoring. It is essential to avoid website performance and availability issues. A great start would be to examine every aspect of your web infrastructure. That's where website monitoring tools come into the picture. With website monitoring services, you can continuously observe your website's performance and uptime. These tools make you aware of any server downtime or connection issues.

Read Post

WebSitePulse

Read more about What Are The Top Website Monitoring Services in 2025?

Monitoring Backstage with OpenTelemetry: Closing the observability blind spot

Jun 2, 2025 By Elizabeth Mathew In SigNoz

‘One small step for a man, but a huge leap for developers’ — me, when I realised how to observe my Backstage with OpenTelemetry. Backstage is often the “portal” through which we manage all our other systems, but who watches the watcher? Recently, we gave a KubeCon Talk, highlighting that monitoring Backstage itself is critical. When Backstage isn’t observable, it becomes a blind spot in your infrastructure.

Read Post

SigNoz

Read more about Monitoring Backstage with OpenTelemetry: Closing the observability blind spot

OnlineOrNot updates from May 2025

Jun 2, 2025 By Max Rozen In OnlineOrNot

As OnlineOrNot has grown, I've been building features quickly to get them into your hands as fast as possible. However, this meant I ended up with multiple versions of similar pages that looked and worked differently from each other. This month, I focused on putting systems in place to create a consistent experience across all parts of the dashboard, making everything look and feel unified.

Read Post

OnlineOrNot

Read more about OnlineOrNot updates from May 2025

Hybrid IT Infrastructure Management

Jun 2, 2025 By Blerim Sheqa In Icinga

Today’s IT environments are rarely confined to a single data center or a single cloud provider. Enterprises are embracing a mix of cloud platforms, virtual machines, and on-premises hardware to stay agile and competitive. This blended environment is known as hybrid IT infrastructure, and managing it effectively is key to keeping systems healthy, secure, and performing at their best.

Read Post

Icinga

Read more about Hybrid IT Infrastructure Management

Quickly Visualize Your Cribl Edge Environment with the New Search Pack

Jun 2, 2025 By Perry Correll and In Cribl

Have you heard that Search Packs are now officially generally available?

Read Post

Cribl

Read more about Quickly Visualize Your Cribl Edge Environment with the New Search Pack

Simple cloud cost management: Grafana Labs integrates open standard FOCUS specification for cloud billing data

Jun 2, 2025 By Rich Kreitz In Grafana

At Grafana Labs, we’ve always believed that observability should be open and accessible, and that belief extends beyond metrics, logs, and traces to the costs associated with managing observability at scale. That’s why we’re excited to share that we’ve adopted the FinOps Open Cost and Usage Specification (FOCUS), a community-driven, open standard for cloud billing data.

Read Post

Grafana

Read more about Simple cloud cost management: Grafana Labs integrates open standard FOCUS specification for cloud billing data

May product updates

Jun 2, 2025 By Colin Bartlett In StatusGator

In May, we released powerful new features to help you stay ahead of outages and communicate status more effectively, focusing on expanding Early Warning Signals across our platform. Let’s dive into what’s new!

Read Post

StatusGator

Read more about May product updates

Sigma Specification 2.0: What You Need to Know

Jun 2, 2025 By Jeff Darrington In Graylog

Sigma rules have become the security team equivalent of LEGO bricks and systems. With LEGO, people can build whatever they can imagine by connecting different types of bricks. With Sigma Specification 2.0 rules, security teams can create vendor-agnostic detections without being limited by proprietary log formats. In response to the Sigma rules’ popularity, the team that built them updated them in August 2024, giving security teams new capabilities.

Read Post

Graylog

Read more about Sigma Specification 2.0: What You Need to Know

Bindplane Launch Week 1 [June 2-6] - Day 1 - New Integrations

Jun 2, 2025 By Bindplane In ObservIQ

Today, we're expanding on the integrations front, launching new integrations for Windows events, network telemetry, and cloud storage. Here's a quick tour of what's new and why it matters.

View Video

ObservIQ

Read more about Bindplane Launch Week 1 [June 2-6] - Day 1 - New Integrations

Jaeger vs Zipkin: Which is Right for Your Distributed Tracing

Jun 2, 2025 By Anjali Udasi In Last9

When requests slow down across your microservices, tracing helps you understand where time is spent. Jaeger and Zipkin are two popular tools for distributed tracing, built to answer a simple question: where did the request go? If you're choosing between them or just exploring options, this guide breaks down the differences and when each one might be a better fit.

Read Post

Last9

Read more about Jaeger vs Zipkin: Which is Right for Your Distributed Tracing

Prometheus Alerting Examples for Developers

Jun 2, 2025 By Prathamesh Sonpatki In Last9

Everything looks fine: dashboards are green, logs are quiet. But users start reporting slow response times. No errors, no traffic spikes. Just a general slowdown. It’s a common situation. Not all problems show up as crashes or clear failures. Sometimes, performance degrades quietly, and standard metrics don’t catch it early. That's where Prometheus alerting can help, if you're monitoring the right signals.

Read Post

Last9

Read more about Prometheus Alerting Examples for Developers

Why Resilience, Not Just Visibility, Is the New Mandate

Jun 1, 2025 By Raja Shekar Mulpuri In HEAL Software

We’ve been in the war rooms. We’ve watched revenue, reputation, and trust erode in real time—not because we lacked telemetry, but because we lacked architecture. Modern enterprise systems fail because their data doesn’t think. Their tooling doesn’t remember. And their automation doesn’t know when to act—or when to stop. The answer is not more monitoring. It’s not dashboards with AI labels.

Read Post

HEAL Software

Read more about Why Resilience, Not Just Visibility, Is the New Mandate

Operations | Monitoring | ITSM | DevOps | Cloud