Monthly Archive

Sponsored Post

3 Ways to Break Down SaaS Data Silos

Jan 31, 2026 By David Bunting In ChaosSearch

Access to data is critical for SaaS companies to understand the state of their applications and how that state affects customer experience. However, most companies use multiple applications, each of which generates its own independent data. This leads to data silos: collections of raw data that are accessible to one stakeholder or department but not to others. Data silos also prevent information from different sources from being blended together to give a more accurate picture of what's happening in your application.

New Relic vs Splunk - In-depth Comparison [2026]

Jan 31, 2026 By Pavithra Parthiban In Atatus

New Relic and Splunk are two prominent tools in the world of observability and monitoring, each serving distinct purposes. New Relic is used for Application Performance Monitoring (APM), offering a full-stack observability platform. Note that New Relic is not a SIEM tool; its primary focus is performance monitoring. Splunk, on the other hand, is used for log management and machine data analytics, and is widely used as a SIEM tool.

Sponsored Post

RISE with SAP Monitoring: Overcoming the 'Black Box' Challenge of monitoring Cloud ERP

Jan 30, 2026 By Avantra Team In Avantra

Organizations transitioning from traditional on-premises SAP systems to Cloud ERP (formerly known as "RISE" and "GROW") have a new set of monitoring challenges. Unlike the familiar on-prem landscape, where IT teams enjoyed full visibility and control, cloud environments can feel like a "black box," with limited direct access to the underlying infrastructure and reliance on service tickets to understand system status.

Grafana 12, from the founder's perspective: design, scale, and the next chapter

Jan 30, 2026 By Grafana Labs Team In Grafana

Sometimes the most interesting engineering stories don’t start with a roadmap or a release plan—they start with personal taste. A preference for good design. A frustration with clunky tools. A desire to see everything in one place.

How we cut our NLQ agent debugging time from hours to minutes with LLM Observability

Jan 30, 2026 By Florent Le Gall In Datadog

This article is part of our series on how Datadog engineering teams use LLM Observability to build, monitor, and improve AI-powered systems. At Datadog, we’re always looking for ways to make complex data easier to explore.

The Future of Dashboards: Git Sync, SQL Expressions, and Dynamic Layouts | Big Tent S3E5

Jan 30, 2026 By Grafana In Grafana

In this episode of Grafana’s Big Tent, Grafana founder Torkel Ödegaard joins Mat Ryer and Tom Wilkie for a wide-ranging conversation about how Grafana began, why design and usability mattered from day one, and how the project evolved into a platform used by tens of millions — from developers to power stations and even space missions.

View Video

Top 15 Application Performance Metrics for Developers and SREs in 2026

Jan 30, 2026 By Mohana Ayeswariya J In Atatus

Every application tells a story of user intent, system behavior, and business impact. To truly understand how your application performs, you need to go beyond logs and errors. You need metrics that provide actionable visibility across your stack. Application performance metrics are the foundation for delivering high-quality digital experiences, and they empower DevOps teams, developers, engineers, and site reliability engineers (SREs) to respond faster, scale smarter, and continuously improve.

Building with the InfluxDB 3 MCP Server & Claude

Jan 30, 2026 By Suyash Joshi In InfluxData

InfluxDB 3 Model Context Protocol (MCP) server lets you manage and query InfluxDB 3 (Core, Enterprise, Dedicated, Serverless, Clustered) using natural language through popular LLM tools like Claude Desktop, ChatGPT Desktop, and other MCP-compatible agents. The setup is straightforward. In this article, we will focus on setting up InfluxDB 3 Enterprise using Docker with Claude Desktop.

Web Performance Metrics: Why INP Is Your Most Practical UX Performance KPI

Jan 30, 2026 By Germain UX Team In Germain UX

Every developer has seen this scene: a user clicks a button, nothing happens, they click again—still nothing—and by the third frustrated tap, three overlapping modals explode onto the screen. The page wasn’t slow to load. It was slow to respond. This highlights the importance of perceived performance—how fast and responsive a website feels to users—which can shape user satisfaction regardless of actual load times.

How Agentic AI is Redefining Network Operations

Jan 30, 2026 By Dallon Robinette In Selector

For much of the past decade, many of the most ambitious ideas in artificial intelligence lived primarily in research papers, labs, and long-term roadmaps. Agentic AI was no exception. The concept of AI systems capable of reasoning, planning, and acting autonomously was widely discussed but largely theoretical. But earlier this month, Gartner released its report The Future of NetOps Is Agentic, reflecting a growing consensus that this has changed. What was once conceptual is now becoming operational.

Context engineering: The missing layer for trusted AI in financial services

Jan 30, 2026 By Karen Mcdermott In Elastic

Financial services AI demands more than models and prompts. Context engineering provides real-time, governed, and explainable intelligence with Elastic serving as the foundational context layer. Artificial intelligence in financial services is no longer constrained by model capability. The real bottleneck is context.

Setting up Malware/Virus Check

Jan 30, 2026 By Uptime Website Monitoring In uptime

In this video, we’ll walk you through how to set up and configure your Malware/Virus check in Uptime.com.

View Video

How to Fix DNS Problems

Jan 30, 2026 By Coroot In Coroot

Kris Buytaert, DevOps movement co-founder and open source advocate, covers everything that can go wrong with DNS and explains why "It's always a Freaking DNS Issue".

View Video

Auvik's 2026 IT & Network Management Predictions

Jan 30, 2026 By Bob Wientzen In Auvik

As IT environments become more distributed, automated, and AI-driven, 2026 will represent a major inflection point for how organizations manage networks, security, and operational resilience. From shadow AI and governance to AI-driven automation and economic uncertainty, Auvik’s executive leadership team shares their predictions on what’s coming, and what IT leaders and MSPs should be preparing for now.

Top tips: Why the most underrated tech skill today is interpretation

Jan 29, 2026 By Nandana Ann Mathew In ManageEngine

Top Tips is a weekly column where we highlight what’s trending in the tech world today and list ways to explore these trends. This week, we’re looking at why interpretation matters when messages, meetings, and notifications never seem to stop. We live in a world where messages travel faster than meaning. Emails are sent in seconds, chats stack up by the hour, and meetings are recorded, transcribed, and summarized before we’ve had time to process what was actually said.

Notes from the Field: Ivanti Workspace Control blocking user logoff on Windows Server 2025

Jan 29, 2026 By GripMatix In GripMatix

As part of our day-to-day consulting work at GripMatix, we spend a significant amount of time in various customer environments where we are designing, validating, and troubleshooting EUC platforms. This particular issue surfaced during work for one of our customers, where we were validating Ivanti Workspace Control (IWC) on a new Windows Server 2025 environment.

From PaaS to Observability: Implementing OTel with VictoriaMetrics

Jan 29, 2026 By VictoriaMetrics In VictoriaMetrics

The final piece of the PaaS puzzle is observability. Once the platform is built, the challenge shifts to managing the volume of data generated by distributed services. In our first Tech Talk of 2026, Mathias and Marc discuss the technical path from platform deployment to standardized observability. We focus on the practical implementation of OpenTelemetry (OTel) and why choosing a high-performance backend is critical to avoiding the "Observability Tax."

View Video

Getting Started with Splunk Dashboards

Jan 29, 2026 By Blog In Squared Up

Splunk is a leading platform for searching, monitoring, and analyzing logs across IT tools and systems. Well-known for its ability to handle vast volumes of log and event data, Splunk empowers organizations to gain real-time visibility into their systems and operations. However, while Splunk offers rich telemetry and analytics, its dashboards can sometimes become complex - making it difficult to surface the most critical insights quickly. That’s where SquaredUp can elevate the experience.

Debug PostgreSQL query latency faster with EXPLAIN ANALYZE in Datadog Database Monitoring

Jan 29, 2026 By Austin Bergstrom In Datadog

In PostgreSQL, the EXPLAIN ANALYZE statement gives you a detailed report of what actually happens when you execute a query. This kind of information is important for troubleshooting slow queries, but using EXPLAIN ANALYZE to collect this data is often challenging in a production environment. Datadog Database Monitoring now supports automatic collection of EXPLAIN ANALYZE plans for PostgreSQL, enabling you to easily capture execution details that help you troubleshoot slow queries.

Tempo 2.10 release: new TraceQL features, LLM-optimized API responses, vParquet5, and more

Jan 29, 2026 By Tiffany Jernigan In Grafana

Tempo 2.10 has arrived, delivering TraceQL enhancements, improved cardinality management for the metrics-generator, vParquet5, and more. You can continue reading and check out the video below to learn more about these and other new features. The Tempo 2.10 release notes and changelog provide more in-depth details and include all of the changes that came with this release.

Redefining Application Management Services - the AIOps Way

Jan 29, 2026 By Swaminathan J In eG Innovations

For years, Application Management/Maintenance Services (AMS) have been the go-to solution for IT leaders trying to keep their business applications stable and running. The AMS pitch was simple: Hand over your apps to us, and we’ll manage and maintain them for you! And for a long time, that model has delivered promising results. It allows internal teams to focus on innovation while service providers handle the operational heavy lifting.

What is DevOps?

Jan 29, 2026 By Coroot In Coroot

Learn what DevOps is from Kris Buytaert: a founder of the movement, co-founder of DevOpsDays, O11y, and Inuits, and FOSS advocate.

View Video

How to Choose the Right API Monitoring Tool for Production Environments

Jan 29, 2026 By Dotcom-Monitor In Dotcom-Monitor

APIs are no longer just technical connectors between systems; they are production infrastructure. Customer-facing applications, partner integrations, payment flows, and internal microservices all depend on APIs working correctly, consistently, and at scale. When an API fails, the impact is rarely limited to a single endpoint; it can disrupt user journeys, compromise revenue, and breach service-level agreements (SLAs).

What's New in Progress Flowmon 13?

Jan 29, 2026 By Progress Flowmon In Flowmon

Join security product experts Filip Cerny and Jan Stritezsky as they unveil the latest innovations in Flowmon 13. This live session will walk you through the most impactful upgrades designed to boost performance, streamline investigations and enhance security visibility across your network.

View Video

AI Agent Monitoring | Debugging Next.js Applications with Sentry

Jan 29, 2026 By Sentry In Sentry

If you're building a Next.js application with AI capabilities, Agent Monitoring brings the AI context together with the rest of your data in Sentry. Correlating tool call duration with database performance and token consumption helps you debug with full-stack context.

View Video

IT as the Proving Ground for AI: Driving Enterprise Innovation

Jan 29, 2026 By Digitate In Digitate

The Enterprise AI Survey, conducted by Digitate in collaboration with Sapio Research, revealed that IT operations have emerged as the primary proving ground for artificial intelligence in the enterprise. With 78% of organizations already deploying AI in IT, 65% identifying ITOps as the biggest AI beneficiary, and adoption outpacing every other function, IT leads enterprise AI maturity.

Less code, faster builds, same telemetry: Turbopack support for the Next.js SDK

Jan 29, 2026 By Sergiy Dybskiy In Sentry

TL;DR - Turbopack became the default in Next.js, so we reworked our SDK to stop depending on bundlers. The result is less code, faster builds, and the same telemetry. This blog explains how we got there. You know the feeling when you spend years building tooling that supports something and all of a sudden that something becomes deprecated and you have to rethink your full approach?

The SRE Report 2026: Why fast is what users trust

Jan 29, 2026 By Leo Vasiliou In Catchpoint

Reliability used to mean “are we up?” Today, customers ask something more demanding: “Are you fast, everywhere, every time?” The SRE Report 2026 shows that this change is no longer emerging. It is already established.

New TraceQL features in Tempo 2.10 | Grafana

Jan 29, 2026 By Grafana In Grafana

In this video, Tiffany Jernigan walks through new TraceQL features in Tempo 2.10 that can help you analyze trace structure, identify traces with attributes that don't exist, and more.

View Video

We're Past Human-Scale Operations. Here's Why.

Jan 29, 2026 By Virtana In Virtana

Ever been on a 100-person P1 call where everyone says, “It’s not us”? That’s not a people problem. It’s a broken operating model. More tools. More data. More teams. And somehow… slower resolution. This is what happens when observability is fragmented across silos. Each team has data, but no one has shared truth—and human-scale operations can’t keep up with modern IT complexity. This clip breaks down why the old model no longer works.

View Video

Navigating the Signal Tsunami: Why Shared Observability Matters

Jan 29, 2026 By Shannon Kalvar In OpsRamp

Digital businesses today generate a flood of telemetry—metrics, logs, traces, and events—at a scale that grows exponentially with every new application, cloud service, and user interaction. In one recent IDC survey, every organization reported sharing observability data across teams, yet nearly half said poor collaboration still prevents them from identifying performance problems.

How AIOps Event Correlation Transforms Incident Response

Jan 29, 2026 By Renuka Suresh In HEAL Software

This post is for IT Operations Leaders, Platform Engineering Managers, SRE Team Leads, and DevOps Directors who manage complex, multi-tool observability environments and are struggling with alert overload and extended incident resolution times.

Debugging AI Agents in Production Without Losing Your Mind

Jan 29, 2026 By SigNoz - Open Source Observability Platform In SigNoz

AI agents are powerful, but debugging them in production is hard. Non-deterministic behavior, LLM latency, and token costs create observability challenges that traditional monitoring tools don't address. In this webinar, engineers from Inkeep and SigNoz walk through how Inkeep monitors its AI agent framework in production using OpenTelemetry-native observability.

View Video

6 Common Factors That Influence Fleet Safety Program Success

Jan 29, 2026 By OpsMatters In OpsMatters

Building a safer fleet is not about one silver bullet. It is a set of practical choices that add up, day after day, until safer habits and smarter tools become the way you operate. This article breaks the work into six factors you can act on. Each one is designed to be simple to start, measurable to manage, and durable enough to last when operations get busy.

Now available: More monitor history

Jan 28, 2026 By Valeria Kurolapova In StatusGator

We’re excited to roll out an improvement many of you have been asking for: extended historical metrics for website and ping monitors. Until now, monitor metrics like availability, downtime, and response times were limited to the last 24 hours. While useful for short-term checks, this made it harder to spot trends, investigate intermittent issues, or understand long-term performance. That changes today.

Why Context, Not Prompts, Determines AI Agent Performance

Jan 28, 2026 By Margo Poda In LogicMonitor

Prompt engineering improves single responses, but agent performance is determined by how execution context is captured, replayed, and constrained over time. For the past few years, enterprises have obsessed over prompts, with entire roles emerging around their design and an ecosystem of tooling and templates following close behind. This focus delivered early gains because it allowed teams to rapidly improve outputs without modifying the surrounding system. Over time, those gains flattened.

Stop Sifting Logs: Find Production Errors in Seconds with `severity=error`

Jan 28, 2026 By Connor James In AppSignal

Want your log queries to be more precise? Is your vibe code flooding you with logs, and do you need a helping hand to make sense of it all? Good news! We've upgraded our log query language to be more powerful, flexible, and intuitive, letting you focus on finding answers fast rather than endlessly scrolling through your logs. And that's not all: We've revamped our logging interface, making it easier than ever to manage logs, customize views, and leverage log attributes.
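
To illustrate the idea of filtering logs by attribute, here is a minimal Python sketch of a `key=value` filter such as `severity=error`. It is not AppSignal's query language or API: the `parse_query` and `filter_logs` helpers, the space-separated term syntax, and the sample log fields are all assumptions made for illustration.

```python
def parse_query(query: str) -> dict[str, str]:
    """Parse space-separated key=value terms into a dict of required attributes."""
    terms: dict[str, str] = {}
    for term in query.split():
        key, sep, value = term.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {term!r}")
        terms[key] = value
    return terms


def filter_logs(entries: list[dict], query: str) -> list[dict]:
    """Return the entries whose attributes match every term in the query."""
    terms = parse_query(query)
    return [
        entry
        for entry in entries
        if all(key in entry and str(entry[key]) == value for key, value in terms.items())
    ]


# Hypothetical log entries; the attribute names are illustrative only.
logs = [
    {"severity": "error", "service": "api", "message": "database timeout"},
    {"severity": "info", "service": "api", "message": "request completed"},
    {"severity": "error", "service": "worker", "message": "job failed"},
]

errors = filter_logs(logs, "severity=error")
api_errors = filter_logs(logs, "severity=error service=api")
```

In this sketch every term must match (an implicit AND), which keeps the filter predictable; a real query language would likely add operators, quoting, and ranges.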

When DIY Becomes a Network Liability

Jan 28, 2026 By Yann Guernion In Broadcom

There is a satisfaction in building things yourself. It is the same psychological hook that powers the endless stream of DIY renovation videos on your social media feeds. You watch a sixty-second clip of someone transforming a pile of lumber into a custom coffee table, and it looks ingenious, cost-effective, and uniquely tailored to their needs. It triggers a powerful "why buy when I can build?" mindset.

Datadog acquires Propolis

Jan 28, 2026 By Hugo Kaczmarek In Datadog

Generative AI enables teams to write and ship code faster than ever. But current methods for testing and quality assurance have not evolved to match the new pace and scale of deployments. Manual and deterministic testing paths quickly become obsolete when new features are released, and they fundamentally can’t test AI outputs, leaving a massive untested surface area. To keep up, teams need new testing methods that can define what goals users have, and ensure that their outcomes match.

Domain Health Check: Why It Matters and What It Reveals

Jan 28, 2026 By Dotcom-Monitor In Dotcom-Monitor

Your domain is more than a URL: it’s the control plane for how people (and machines) reach your website, apps, and inbox. When something breaks at the domain layer, the symptoms look “random” (site intermittently down, emails bouncing, logins failing), but the root cause is often predictable: misconfigurations, weak authentication, or degraded DNS performance. A domain health check is the fastest way to surface those issues before customers do.

Taming Atlassian Audit Logs: Processing messy JSON to enable operational insights

Jan 28, 2026 By Graylog In Graylog

Atlassian’s audit records are data-rich, but messy. In this data-driven deep dive, Eddy Gurney from NetScout shares what it took to get them into Graylog. He walks through four pipeline approaches and why each fell short, then shows how moving parsing to the edge with Filebeat unlocked Graylog. With clean, flattened events flowing in, alerts and dashboards turn “noise” into operational visibility. You’ll also see how Sidecars makes config rollout easy, plus what changes to make if you’re on Atlassian Cloud instead of Data Center.

View Video

Event context, tags, logs and metrics | Debugging Next.js Applications with Sentry

Jan 28, 2026 By Sentry In Sentry

Adding extra information to issues captured in Sentry can help you identify and prioritize your most critical issues. Logs and Metrics build context around an error and help you understand correlation and causation in one place, because everything is connected by trace.

View Video

Log Drains Now Available: Bringing Your Platform Logs Directly Into Sentry

Jan 28, 2026 By Allison Rogers In Sentry

Sentry now supports log drains, making it easy to forward logs into Sentry without application code changes or manual project-key lookups. If your logs already exist somewhere else, you can now see them alongside errors and traces in Sentry. Want to get started right away? The quickstart guide is one click away.

Introducing Obkio's Remote User Monitoring Plan: For Distributed Workforces

Jan 28, 2026 By Alyssa Lamberti In Obkio

The way we work has fundamentally changed. Remote and hybrid work aren't temporary shifts; they're the new reality for most organizations. And with that reality comes a challenge IT teams know all too well: how do you troubleshoot network issues for users you can't physically reach?

From Atlassian JSON to Actionable Audit Insights

Jan 28, 2026 By Jeff Darrington In Graylog

Atlassian audit logs carry high-value security and operational signals, yet the raw format makes them hard to use in day-to-day investigations. Nested JSON, arrays inside arrays, and localization keys turn routine questions into slow, manual work. For lean Security and IT teams, that friction shows up as delayed triage, fragile dashboards, and alerts that fire without enough context to act.
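
As a rough illustration of why nested JSON is awkward to query and what flattening buys you, the Python sketch below turns nested objects and arrays into a single level of dotted keys. The `flatten` helper and the sample event are hypothetical; the field names are not Atlassian's actual audit schema, and this is not the approach the post itself describes.

```python
from typing import Any


def flatten(value: Any, prefix: str = "", sep: str = ".") -> dict[str, Any]:
    """Flatten nested dicts and lists into a single-level dict of dotted keys."""
    flat: dict[str, Any] = {}
    if isinstance(value, dict):
        for key, child in value.items():
            child_prefix = f"{prefix}{sep}{key}" if prefix else str(key)
            flat.update(flatten(child, child_prefix, sep))
    elif isinstance(value, list):
        # List positions become numeric path segments, e.g. "changes.0.field".
        for index, child in enumerate(value):
            child_prefix = f"{prefix}{sep}{index}" if prefix else str(index)
            flat.update(flatten(child, child_prefix, sep))
    else:
        flat[prefix] = value
    return flat


# Hypothetical audit event; field names are illustrative only.
event = {
    "author": {"name": "alice"},
    "changes": [{"field": "role", "values": ["user", "admin"]}],
}

flat_event = flatten(event)
```

Once events are flat, a dashboard or alert rule can reference a field such as `changes.0.field` directly instead of walking arrays inside arrays.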

From Ukraine to the Cloud: Stories of IPv4 Migration

Jan 28, 2026 By Doug Madory In Kentik

This post expands on our analysis from last year that revealed that as much as 20% of IPv4 space has migrated out of Ukraine in the years following the Russian invasion in February 2022. This update reveals that AT&T (a popular destination for Ukrainian IPs) has since implemented a policy ridding itself of customers using AS7018 to originate their routes, often to support residential proxies.

Scaling AI Reliability: Real world lessons from Mistral AI

Jan 28, 2026 By Checkly In Checkly

How does one of the world's leading AI companies keep its infrastructure reliable while shipping new models constantly? In this webinar, Devon Mizelle, Senior SRE at Mistral AI, shares the real story. Devon walks through how Mistral built an automated system that generates synthetic checks for every model the moment it goes live—no manual configuration, no forgotten monitors, no inconsistent alerting. Using monitoring as code, his team eliminated the toil of maintaining hundreds of checks across a rapidly evolving model ecosystem.
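
The general idea of monitoring as code, where checks are generated from a list of models rather than configured by hand, can be sketched in Python as follows. This is not Mistral's system or Checkly's API: the `SyntheticCheck` type, the URL pattern, the default frequency, and the channel name are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyntheticCheck:
    """A hypothetical check definition; not a real vendor schema."""
    name: str
    url: str
    frequency_minutes: int
    alert_channel: str


def checks_for_models(models: list[str], base_url: str) -> list[SyntheticCheck]:
    """Generate one check per unique model so every model gets identical settings."""
    return [
        SyntheticCheck(
            name=f"synthetic-{model}",
            url=f"{base_url}/models/{model}/health",  # assumed URL pattern
            frequency_minutes=5,
            alert_channel="sre-oncall",
        )
        for model in sorted(set(models))
    ]


# Hypothetical model names and base URL.
live_models = ["model-small", "model-large", "model-small"]
checks = checks_for_models(live_models, "https://api.example.com")
```

Because the checks are derived from the model list, adding a model adds its monitor automatically, and every check shares the same frequency and alert routing.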

View Video

How to Create an SNMP Poller in SolarWinds Observability Self-Hosted

Jan 28, 2026 By solarwindsinc In SolarWinds

SolarWinds technical trainer Cheryl Nomanson presents a systematic approach to optimizing and building custom SNMP pollers. The tutorial walks through a step-by-step process starting with adding devices for SNMP monitoring using default pollers, then identifying missing metrics and checking if the required OIDs exist. If OIDs don't exist, she explains how to use alternative OIDs or data transformation tools.

View Video

How Alerting Works in SolarWinds Observability Self-Hosted

Jan 28, 2026 By solarwindsinc In SolarWinds

This training video from SolarWinds Academy provides a high-level overview of how the alerting process works within SolarWinds software. Technical trainer Cheryl Nomanson explains the step-by-step workflow, starting with the alerting engine continuously scanning the database for conditions that meet alert trigger thresholds. She covers how triggered elements are evaluated for suppressions (like time-of-day restrictions and scoping), and explains that only fully qualified conditions become actual alerts. The video details how alerts always display in the web console and may trigger additional actions like emails or scripts.
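
The workflow described above (scan for trigger conditions, evaluate suppressions, and only then raise an alert) can be sketched in Python. This is a simplified model, not SolarWinds code: the `Reading` type, the CPU threshold, the quiet-hours window, and the site scoping are hypothetical stand-ins for the trigger and suppression rules the video covers.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class Reading:
    node: str
    site: str
    cpu_percent: float


# Hypothetical rule settings for illustration.
CPU_THRESHOLD = 90.0
QUIET_HOURS = (time(1, 0), time(3, 0))  # time-of-day suppression window
IN_SCOPE_SITES = {"london", "austin"}   # scoping: only these sites may alert


def is_triggered(reading: Reading) -> bool:
    """Step 1: does the reading meet the trigger threshold?"""
    return reading.cpu_percent >= CPU_THRESHOLD


def is_suppressed(reading: Reading, now: time) -> bool:
    """Step 2: is the triggered reading held back by time-of-day or scope?"""
    quiet_start, quiet_end = QUIET_HOURS
    in_quiet_hours = quiet_start <= now < quiet_end
    return in_quiet_hours or reading.site not in IN_SCOPE_SITES


def evaluate(readings: list[Reading], now: time) -> list[str]:
    """Step 3: only triggered, unsuppressed readings become alerts."""
    return [
        f"{reading.node}: CPU at {reading.cpu_percent:.0f}%"
        for reading in readings
        if is_triggered(reading) and not is_suppressed(reading, now)
    ]


readings = [
    Reading("core-sw-01", "london", 97.0),
    Reading("edge-rtr-02", "austin", 42.0),
    Reading("lab-sw-09", "lab", 99.0),
]

daytime_alerts = evaluate(readings, time(14, 30))
overnight_alerts = evaluate(readings, time(2, 0))
```

In a fuller model, each resulting alert would always be shown in the console and could optionally fan out to actions such as emails or scripts.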

View Video

Networking Technology Trends for 2026

Jan 28, 2026 By Nolan Greene In Auvik

From an IT pro’s perspective, the future of networking technology in 2026 is a mixed bag of potential and security risk. New wireless tech, agentic AI, and the increased distribution of networks are enabling new use cases and helping automate toil, but they also create new attack surfaces and risk profiles. In this article, we’ll take a look at the ten network security trends we’re most excited about in 2026 and provide key insights about what each one means for IT and MSP teams.

The Incident Checklist: Reducing Cognitive Load When It Matters Most

Jan 28, 2026 By James Barnes In StatusCake

In the previous post, we looked at what happens after detection, when incidents stop being purely technical problems and become human ones, with cognitive load as the real constraint. This post assumes that context. The question here is simpler and more practical. What actually helps teams think clearly and act well once things are already going wrong? One answer, used quietly but consistently by high-performing teams, is the checklist.

Integrating Prometheus Metrics into Icinga Using check_prometheus

Jan 28, 2026 By Julian Brost In Icinga

This article explains how to integrate metrics from Prometheus into Icinga checks using the check_prometheus plugin. There can be multiple reasons why this could be desired: Maybe you have different teams with their own monitoring systems, and you need to bridge the gap, or you want to perform queries that are just better expressed in Prometheus than in plain Icinga check plugins. The latter can be the case if you want to aggregate data from multiple sources or you want to take historic data into account.
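
To make the bridging idea concrete, the Python sketch below maps a Prometheus instant-query response onto the standard monitoring-plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). It is not the check_prometheus plugin and does not reflect its options; the `evaluate_response` helper, the thresholds, and the sample response are assumptions. The response shape follows the Prometheus HTTP API's `vector` result format.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate_response(response: dict, warn: float, crit: float) -> tuple[int, str]:
    """Turn a Prometheus instant-query response into a plugin exit code and message."""
    if response.get("status") != "success":
        return UNKNOWN, "UNKNOWN - Prometheus query did not succeed"

    series = response.get("data", {}).get("result", [])
    if not series:
        return UNKNOWN, "UNKNOWN - query returned no series"

    # Each vector sample is [timestamp, "value"]; alert on the worst series.
    worst = max(float(sample["value"][1]) for sample in series)
    if worst >= crit:
        return CRITICAL, f"CRITICAL - worst value {worst:g}"
    if worst >= warn:
        return WARNING, f"WARNING - worst value {worst:g}"
    return OK, f"OK - worst value {worst:g}"


# Hypothetical response, as if returned for a disk-usage ratio query.
sample_response = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"instance": "db-1"}, "value": [1769600000, "0.42"]},
            {"metric": {"instance": "db-2"}, "value": [1769600000, "0.93"]},
        ],
    },
}

exit_code, message = evaluate_response(sample_response, warn=0.8, crit=0.9)
```

Aggregating across series before applying thresholds is the kind of logic that is easier to express as a single Prometheus query than as separate per-host checks.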

AI is not intelligent. It's obedient.

Jan 27, 2026 By Alsherin In ManageEngine

Tech companies and brands love calling AI “intelligent.” But is it really? AI doesn’t decide what matters. Humans do. We decide what’s important, then feed prompts, data, and instructions into AI models so they work the way they do. At the end of the day, AI is obedient to human intelligence, not the other way around. And it’s on us to use it in ways that actually matter, instead of dismissing it or freaking out that it’s going to replace humans.

Three-quarters of UK IT teams suffer outages due to missing critical alerts

Jan 27, 2026 By Splunk In Splunk

Confronted with issues like tool sprawl and excessive alerts, UK teams report higher-than-average rates of alert fatigue and burnout.

Why Does Digital Employee Experience (DEX) Matter for Business Outcomes?

Jan 27, 2026 By Nexthink In Nexthink

A single disengaged employee can cost an organization approximately $2,246 annually. In today’s technology-driven workplaces, that cost increasingly comes from technology friction – the everyday delays, disruptions, and inefficiencies caused by poorly performing tools. “There’s not really any businesses today that are not fundamentally technology businesses now.” - Dan Anthony, CIO at FedNow.

Read Post

Nexthink

Read more about Why Does Digital Employee Experience (DEX) Matter for Business Outcomes?

AI Is WAY More Expensive Than You Think... | SolarWinds TechPod #105

Jan 27, 2026 By solarwindsinc In SolarWinds

Artificial intelligence isn’t just about innovation and efficiency — it comes with hidden costs. From massive data centers and rising energy consumption to layoffs, governance, and long-term business impact, the real price of AI is often ignored. Companies rush to adopt AI, but are they calculating the true cost for the environment and their bottom line?

View Video

SolarWinds

Read more about AI Is WAY More Expensive Than You Think... | SolarWinds TechPod #105

Business intelligence plugins for Grafana: what's next

Jan 27, 2026 By David Harris In Grafana

Volkov Labs has been a longtime partner to Grafana Labs, with co-founder Mikhail Volkov contributing to Grafana in the early stages of the OSS project. On Sept. 26, the Florida-based company that recently created a suite of business intelligence (BI) plugins for Grafana announced it had been acquired. In light of the news, Grafana Labs committed to taking over the maintenance and development of their popular business intelligence (BI) plugin suite.

Read Post

Grafana

Read more about Business intelligence plugins for Grafana: what's next

Take Back Control of Your Observability Spend

Jan 27, 2026 By Mezmo In Mezmo

As budgets reset for 2026, engineering leaders are making a resolution: no more vendor lock-in. Here’s how to keep that promise by building on the technical foundations of data reliability and simplified collection. It’s January 2026, and if you’re like most engineering leaders, you’re staring at your observability vendor contracts with a mix of frustration and resignation.

Read Post

Mezmo

Read more about Take Back Control of Your Observability Spend

Session Replay | Debugging Next.js Applications with Sentry

Jan 27, 2026 By Sentry In Sentry

Session Replay lets you see how your users experienced your Next.js application before a crash happened. Aside from how the user used your app, it also captures the console output of the browser, the network requests, and the memory snapshot, so you get all the information needed to debug the issue. In this video you’ll learn how to use Session Replay and implement it in your Next.js application.

View Video

Sentry

Read more about Session Replay | Debugging Next.js Applications with Sentry

Getting Started with Seer - Sentry's AI Debugging Agent

Jan 27, 2026 By Sentry In Sentry

Seer is Sentry's AI Debugging agent that has access to all the context that Sentry pulls together from your applications. Sometimes it shows up predicting bugs before they ship to prod. Sometimes it's catching issues in prod and bringing you the fix. Seer pulls from distributed traces, logs, profiles, stack traces, errors, and your codebase, and helps you find the broken parts of your application and fix them faster.

View Video

Sentry

Read more about Getting Started with Seer - Sentry's AI Debugging Agent

Reality Bytes: Nexthink Drops Spark!! Big News (+ Emotions)

Jan 27, 2026 By Nexthink In Nexthink

The full panel comes together to mark Tim Flower’s final appearance on Reality Bytes, reflecting on his impact, insight, and anchoring presence over the years. Alongside the goodbyes, the conversation turns to a landmark moment for Nexthink: the release of Spark. Framed as a pivotal shift in the capabilities of digital employee experience, Spark is explored through real-world stories and personal takes, including how it empowers employees, reduces IT friction, and redefines support.

View Video

Nexthink

Read more about Reality Bytes: Nexthink Drops Spark!! Big News (+ Emotions)

The 2026 IT Leader's Priority Shift: Why AI, Resilience, and Visibility Now Outrank Everything Else

Jan 27, 2026 By Sofia Burton In LogicMonitor

IT leaders are replacing traditional focuses with three things that now outrank everything else: AI readiness, operational resilience, and unified visibility. You can’t add another priority to the list. There’s no space left. Your team is already stretched managing hybrid infrastructure, responding to incidents, juggling tool sprawl, and delivering on AI promises while keeping costs under control.

Read Post

LogicMonitor

Read more about The 2026 IT Leader's Priority Shift: Why AI, Resilience, and Visibility Now Outrank Everything Else

Why ITOps Automation Is Hard, Until You Change Your Approach

Jan 27, 2026 By Margo Poda In LogicMonitor

Automation fails in ITOps because it’s treated as a local efficiency gain rather than a system-level change—an approach that breaks down at scale as AI raises the bar for context, ownership, and control. Modern ITOps environments are hybrid, distributed, and assembled from overlapping vendors and platforms. Services run across clouds and teams. Signals arrive continuously. Dependencies change faster than they can be documented.

Read Post

LogicMonitor

Read more about Why ITOps Automation Is Hard, Until You Change Your Approach

Kubernetes Logging Best Practices

Jan 27, 2026 By Jeff Darrington In Graylog

You’re sitting at your desk, typing away, when all of a sudden you hear a “ping!” Unfortunately, you have a browser with fifteen tabs open, a task management application, email, messaging applications, and calendars all open, making it difficult to know exactly which technology just pinged you. To identify the source, you open your system settings and look at the notifications section to see which ones you allow to make a sound.

Read Post

Graylog

Read more about Kubernetes Logging Best Practices

Getting Started with InfluxDB and Pandas: A Beginner's Guide

Jan 27, 2026 By Anais Dotis-Georgiou In InfluxData

InfluxData prides itself on prioritizing developer happiness. A key ingredient to that formula is providing client libraries that let users interact with the database in their chosen language and library. Data analysis is the task most broadly associated with Python use cases, accounting for 58% of Python tasks, so it makes sense that Pandas is the second most popular library for Python users.

Read Post

InfluxData

Read more about Getting Started with InfluxDB and Pandas: A Beginner's Guide

What API Performance Monitoring Looks Like in Real Production Environments

Jan 27, 2026 By Dotcom-Monitor In Dotcom-Monitor

API performance monitoring has become a critical discipline for modern engineering teams, but most conversations around it stop at metrics, dashboards, and testing tools. Teams measure response time, track error rates, and run performance tests before release, yet APIs still slow down, silently fail, or violate SLAs in production. The problem isn’t a lack of monitoring. It’s a mismatch between how APIs are tested and how they actually behave in the real world.

Read Post

Dotcom-Monitor

Read more about What API Performance Monitoring Looks Like in Real Production Environments

Part Two: Turning Event Intelligence into Action - Real-World Value for Financial Enterprises

Jan 27, 2026 By david.arrowsmith In Interlink

Event Intelligence Solutions are redefining how organizations manage complexity and risk across digital ecosystems. Their true power lies not only in detecting anomalies or suppressing noise, but in providing actionable, explainable intelligence that connects IT events to business impact.

Read Post

Interlink

Read more about Part Two: Turning Event Intelligence into Action - Real-World Value for Financial Enterprises

Seer: debug with AI at every stage of development

Jan 27, 2026 By Indragie Karunaratne In Sentry

When we launched Seer, our AI debugging agent, we built it on a core belief: production context is essential for understanding the complex failure modes of real-world software. Seer uses the detailed telemetry that Sentry collects (errors, spans, logs, metrics, and more) to accurately root cause and fix bugs. Because this telemetry is trace-connected, Seer can deterministically traverse all the data relevant to a problem rather than relying exclusively on imprecise time-range searches.

Read Post

Sentry

Read more about Seer: debug with AI at every stage of development

Top 12 Distributed Tracing Tools in 2026: Complete Comparison & Reviews

Jan 27, 2026 By Sematext In Sematext

Distributed tracing has become essential for modern software teams. As applications evolve into complex distributed systems with microservices, APIs, databases, and third-party integrations, understanding how a single user request travels through your entire stack is no longer optional; it’s critical for maintaining performance, reliability, and user satisfaction.

Read Post

Sematext

Read more about Top 12 Distributed Tracing Tools in 2026: Complete Comparison & Reviews

Bindplane + Statsig Integration: Unified Telemetry for Product Metrics and Experimentation

Jan 27, 2026 By Adnan Rahic In ObservIQ

We’re excited to announce a new integration between Bindplane and Statsig, making it easier to collect, process, and route OpenTelemetry signals into Statsig at scale. This integration provides a seamless way to connect Statsig with the OpenTelemetry ecosystem using Bindplane’s vendor-neutral, OpenTelemetry-native telemetry pipeline. Focus on product insight, not collector operations.

Read Post

ObservIQ

Read more about Bindplane + Statsig Integration: Unified Telemetry for Product Metrics and Experimentation

Introducing IsDown MCP

Jan 26, 2026 By Nuno Tomas In isDown

We're excited to announce a new way to interact with IsDown: the MCP Server. Now you can query your monitoring data directly from AI assistants like Claude.

Read Post

isDown

Read more about Introducing IsDown MCP

How Observability Cuts IT Costs? [7 Proven Ways to Reduce Infra, Storage and Operational Spend for 2026]

Jan 26, 2026 By Pavithra Parthiban In Atatus

IT budgets are getting squeezed, yet teams are expected to deliver faster releases, higher reliability and tighter security. Observability has become one of the few levers that directly influences IT cost reduction because it gives teams the ability to understand exactly what’s consuming resources, wasting storage, dragging performance, and inflating operational workload. In this guide, you’ll learn seven evidence-backed strategies that leading engineering teams use to cut expenditure.

Read Post

Atatus

Read more about How Observability Cuts IT Costs? [7 Proven Ways to Reduce Infra, Storage and Operational Spend for 2026]

How to Build a Great Knowledge Base in Notion

Jan 26, 2026 By Super Monitoring In Super Monitoring

What is a knowledge base, and why do you need one? Before getting into Notion, let us make one point crystal clear: a knowledge base is simply one trusted location where information lives. This article demonstrates how you can create a clean, simple, and functional knowledge base with a current version of Notion.

Read Post

Super Monitoring

Read more about How to Build a Great Knowledge Base in Notion

Clustered Directors, Pipeline Debugging, and More Integrations

Jan 26, 2026 By VirtualMetric In VirtualMetric

Over the past two months, VirtualMetric DataStream delivered a substantial update cycle focused on resilience, productivity, and platform extensibility. This release strengthens the core architecture, makes pipeline development and troubleshooting significantly easier, and expands integration coverage across schemas, SIEMs, and cloud platforms. Let’s take a closer look.

Read Post

VirtualMetric

Read more about Clustered Directors, Pipeline Debugging, and More Integrations

API Monitoring: Metrics, Best Practices, Tools, and Setup Playbooks

Jan 26, 2026 By Dotcom-Monitor In Dotcom-Monitor

Modern systems rarely fail in obvious ways. An API might slow down in one region, return subtly incorrect data after a deploy, or degrade only under specific traffic patterns. By the time users report the issue, it has often already impacted reliability, revenue, or trust. This is why API monitoring has evolved from a simple uptime check into a core production discipline.

Read Post

Dotcom-Monitor

Read more about API Monitoring: Metrics, Best Practices, Tools, and Setup Playbooks

Healthcare IT Trends to Know Before 2026

Jan 26, 2026 By Rebecca Grassing In Auvik

Healthcare technology is evolving at a pace that would’ve seemed impossible just a few years ago. From smart hospitals and connected medical devices to AI-powered diagnostics and remote patient monitoring, digital innovation is shifting how care is delivered and how healthcare IT teams operate. The next wave of healthcare IT trends will push infrastructure, security, and data systems further than ever before.

Read Post

Auvik

Read more about Healthcare IT Trends to Know Before 2026

Cribl Insights Demo

Jan 26, 2026 By Cribl In Cribl

Get instant visibility into how data moves through your Cribl environment with Cribl Insights. In this demo, Product Manager Pavan Venkatesh walks through how Cribl Insights helps you understand system health, data flows, and how to set up alerts and notifications —so you can proactively troubleshoot faster and optimize resources to get max ROI.

View Video

Cribl

Read more about Cribl Insights Demo

Top 25 Web Application Monitoring Tools (2026 Edition)

Jan 26, 2026 By Dotcom-Monitor In Dotcom-Monitor

In today’s fast-paced digital world, web application monitoring tools are no longer a luxury but a necessity for maintaining robust, high-performing online services. Whether you’re running an e-commerce giant, a SaaS platform, or a critical internal application, understanding your application’s health and user experience is paramount.

Read Post

Dotcom-Monitor

Read more about Top 25 Web Application Monitoring Tools (2026 Edition)

Migrating from PRTG to WhatsUp Gold: The Ultimate Guide

Jan 26, 2026 By Jason Alberino In WhatsUp Gold

Migrating from PRTG to WhatsUp Gold can feel daunting, but with the right approach, it’s a smooth transition that unlocks powerful monitoring capabilities and a simplified user experience. WhatsUp Gold offers intuitive dashboards, flexible licensing, and advanced features like Network Traffic Analysis, Application Monitoring, and integrated Network Detection & Response (NDR) for comprehensive visibility across hybrid environments.

Read Post

WhatsUp Gold

Read more about Migrating from PRTG to WhatsUp Gold: The Ultimate Guide

Actionable Network Device Monitoring with Automated Anomaly Detection and AI Troubleshooting

Jan 26, 2026 By Netdata In netdata

Network device monitoring is often a mess of polling, graphs, and alerts that don't lead to answers. In this webinar, we'll show how to monitor routers, switches, and firewalls in a way that quickly surfaces what matters: interface health, errors, drops, saturation, latency signals, and performance regressions—without drowning in noise. You'll learn how Netdata turns raw SNMP metrics into high-signal insights using automated anomaly detection and AI-assisted troubleshooting, so your team can move from 'something is wrong' to 'here's the root cause' faster.

View Video

netdata

Read more about Actionable Network Device Monitoring with Automated Anomaly Detection and AI Troubleshooting

API Observability: Why Outside-In Signals Are Still Essential

Jan 25, 2026 By Dotcom-Monitor In Dotcom-Monitor

API observability has become a go-to goal for modern engineering teams. As architectures shift to microservices and APIs become the backbone of products, teams need a reliable way to understand what’s happening across services, before issues turn into incidents. That’s where observability comes in: collect the right signals, connect the dots, and debug faster.

Read Post

Dotcom-Monitor

Read more about API Observability: Why Outside-In Signals Are Still Essential

GenAI Observability in Grafana Cloud: End-to-End Agent Debugging (Demo)

Jan 25, 2026 By Grafana In Grafana

From Observability for GenAI Applications (Grafana OpenTelemetry Community Call): we drill into traces to see which agents called which tools, where errors occurred, how long each LLM call took, and how costs and tokens are distributed. The walkthrough also covers using AI assistance to summarize long traces and identify optimization opportunities in real time.

View Video

Grafana

Read more about GenAI Observability in Grafana Cloud: End-to-End Agent Debugging (Demo)

Evaluating LLM Agents in the Real World: Inside Grafana Assistant

Jan 25, 2026 By Grafana In Grafana

From Observability for GenAI Applications (Grafana OpenTelemetry Community Call)

View Video

Grafana

Read more about Evaluating LLM Agents in the Real World: Inside Grafana Assistant

Introducing System Datasets: Observing the Observability Platform

Jan 25, 2026 By Coralogix Team In Coralogix

Modern observability platforms are great at explaining what’s happening in your apps and your infrastructure. However, all too often the observability platform itself remains a black box. As observability data and usage grow, governance almost always lags behind, and teams struggle to answer basic operational questions. The relevant data is typically fragmented across admin UIs, billing pages, support tickets, and tribal knowledge.

Read Post

Coralogix

Read more about Introducing System Datasets: Observing the Observability Platform

SQL performance improvements: automatic detection & regression testing (part 3)

Jan 25, 2026 By Mattias Geniar In Oh Dear

This is the final part of our 3-part series on SQL performance improvements. In part 1, we covered how to identify slow queries. In part 2, we explored how to fix them with indexes. In this post, we'll share how we prevent those performance issues from ever reaching production again. A few weeks ago, we massively improved the performance of the dashboard & website by optimizing our SQL queries.

Read Post

Oh Dear

Read more about SQL performance improvements: automatic detection & regression testing (part 3)

Monitor groups are now supported in the API

Jan 24, 2026 By Valeria Kurolapova In StatusGator

We recently launched monitor groups, making it easier to organize monitors on your boards and status pages. Now that same functionality is available in the StatusGator API, so you can manage monitor groups programmatically. The API now supports listing, creating, updating, and deleting monitor groups on a board. You can also assign or remove monitors from groups when creating or updating a monitor.

Read Post

StatusGator

Read more about Monitor groups are now supported in the API

Best DNS Monitoring Tools in 2026

Jan 24, 2026 By Dotcom-Monitor In Dotcom-Monitor

DNS monitoring is the practice of continuously checking that your domain names resolve correctly (right records, right answers) and that DNS lookups are fast and reliable from multiple locations. Depending on the tool, it can also watch for unexpected DNS record changes (A/AAAA/CNAME/MX/NS/TXT, etc.), validate DNSSEC, and pinpoint where resolution breaks in the chain.

Read Post

Dotcom-Monitor

Read more about Best DNS Monitoring Tools in 2026

AI Is Bigger Than LLMs: Why Network Teams Need to Think Beyond Chatbots and Agents

Jan 23, 2026 By Phil Gervasi In Kentik

AI in network operations is more than chatbots and agents. LLMs make AI easier to use, but the real value comes from the underlying system of telemetry, data pipelines, analytics, ML models, domain knowledge, and workflows that help engineers reason, predict, and act. When designed thoughtfully, AI doesn’t replace engineers. Instead, it augments their expertise and reduces cognitive load across complex network operations.

Read Post

Kentik

Read more about AI Is Bigger Than LLMs: Why Network Teams Need to Think Beyond Chatbots and Agents

Building a synthetic monitoring solution for Jaeger with Grafana k6

Jan 23, 2026 By Wilfried Roset In Grafana

Wilfried Roset is an engineering manager who leads an SRE team and is a Grafana Champion. Wilfried currently works at OVHcloud where he focuses on prioritizing sustainability, resilience, and industrialization to guarantee customer satisfaction. As an SRE Engineering Manager and a Grafana Champion, I believe a resilient and sustainable cloud experience begins with strong observability.

Read Post

Grafana

Read more about Building a synthetic monitoring solution for Jaeger with Grafana k6

k8s-monitoring-helm Chart Office Hours (January 2026)

Jan 23, 2026 By Grafana In Grafana

In the January edition of the Kubernetes Monitoring Helm chart office hours, we discuss the version 3.7 release, the upcoming 3.7 and 4.0 releases and features, and the upcoming deprecation of the 1.x and 2.0 versions.

View Video

Grafana

Read more about k8s-monitoring-helm Chart Office Hours (January 2026)

Uptime.com Real User Monitoring Report

Jan 23, 2026 By Uptime Website Monitoring In uptime

Take an in-depth tour of the Uptime.com RUM report. Comprehensively understand your users and your baselines. Organize RUM data by URL(s) or group URL(s) to track subdomains, and segment data by device, operating system, browser, country, or other geography to compare metrics within specific time windows against your website or application’s performance monitoring baselines.

View Video

uptime

Read more about Uptime.com Real User Monitoring Report

API Uptime Monitoring Explained: How to Measure True API Availability in Production

Jan 23, 2026 By Dotcom-Monitor In Dotcom-Monitor

For many teams, API uptime monitoring still means one simple thing: checking whether an endpoint responds with a 200 OK. If the check passes, the API is marked as “up.” If it fails, an alert is triggered. On paper, that sounds reasonable. In practice, it’s one of the most common reasons API outages go unnoticed until users complain. The problem is that modern APIs are no longer simple, stateless endpoints.

Read Post

Dotcom-Monitor

Read more about API Uptime Monitoring Explained: How to Measure True API Availability in Production

Cribl Notebooks Demo Powered by Lakehouse

Jan 23, 2026 By Cribl In Cribl

In this demo, Product Manager David Cavuto walks through how Cribl Notebooks helps security and IT teams investigate faster by combining live data, historical context, and interactive analysis — all in one place.

View Video

Cribl

Read more about Cribl Notebooks Demo Powered by Lakehouse

AI in Production Is Growing Faster Than We Can Trust it

Jan 23, 2026 By Fahim Zaman In Honeycomb

Enterprise software has moved past the generative AI testing phase. Businesses with millions of daily users or workloads are no longer just prototyping LLMs in a vacuum. They’re directly wiring agentic efficiency into product interfaces and infrastructure to stay competitive. This wave is often compared to the spread of microservices in the past, but we aren’t just adding new dependencies and complexity.

Read Post

Honeycomb

Read more about AI in Production Is Growing Faster Than We Can Trust it

Stop Flying Blind: Synthetic Monitoring, Host heat-maps, and Process-Level Visibility

Jan 23, 2026 By Nishant Modak In Last9

January 2026 Release: Here's a dirty secret about observability: most teams find out about outages from their customers. Not from their dashboards. Not from their alerts. From angry tweets and support tickets. The excuse is always the same: "We have metrics! We have dashboards! We even have that AI thing now!" And yet, somehow, your checkout endpoint has been returning 502s for forty-five minutes and you're learning about it from the VP of Sales who just got off a call with your biggest customer.

Read Post

Last9

Read more about Stop Flying Blind: Synthetic Monitoring, Host heat-maps, and Process-Level Visibility

Helping Service Providers Build Future-Ready Autonomous Networks

Jan 23, 2026 By Splunk In Splunk

As network complexity scales, Splunk empowers service providers to transition toward autonomous networking by integrating automated monitoring with AI-driven root-cause analysis. By shifting from reactive troubleshooting to proactive, automated remediation, providers can resolve issues before they impact the user experience. This evolution ensures seamless digital connectivity while simultaneously reducing customer churn and the high costs of manual network maintenance.

View Video

Splunk

Read more about Helping Service Providers Build Future-Ready Autonomous Networks

Observability Trends to Watch in 2026 - Cutting Down On Signal Noise

Jan 23, 2026 By Splunk In Splunk

Cut down on noisy telemetry data and move toward high signal density.

View Video

Splunk

Read more about Observability Trends to Watch in 2026 - Cutting Down On Signal Noise

Heartbeat behind the metrics | Jasper on why availability will never stop mattering

Jan 23, 2026 By ManageEngine Site24x7 In Site24x7

What does it take to build a monitoring platform that teams rely on every single day? In this episode of Heartbeat Behind the Metrics, Jamesraj Paul Jasper, Principal Product Manager of Site24x7, talks about his 15-year journey with the product and the moments that still stand out. He dives into why APM Insights is closest to his heart, and also shares a proud team moment where a complex enterprise feature was designed, built, and shipped in just two weeks through tight coordination.

View Video

Site24x7

Read more about Heartbeat behind the metrics | Jasper on why availability will never stop mattering

Top Education Technology Trends to Watch Through 2026

Jan 23, 2026 By Rebecca Grassing In Auvik

The education technology landscape is entering a period of consolidation and integration. Schools are moving past the online learning experimentation phase of recent years and focusing on technologies that deliver measurable improvements in teaching and learning outcomes. For IT professionals managing educational networks, understanding these shifts helps prioritize infrastructure investments and security protocols.

Read Post

Auvik

Read more about Top Education Technology Trends to Watch Through 2026

LLM Providers Status Report - December 2025

Jan 22, 2026 By Nuno Tomas In isDown

This report covers the operational status of major AI systems during December 2025, including Anthropic, Cohere, DeepSeek, Google Gemini, Groq Cloud, OpenAI, Perplexity, Replicate, and xAI. The data encompasses official incidents reported by providers and monitoring insights from IsDown's detection system.

Read Post

isDown

Read more about LLM Providers Status Report - December 2025

Sponsored Post

Breaking Down IT Silos with OpManager Plus's Full-stack observability

Jan 22, 2026 By Sandhya Saravanan In ManageEngine

In today's complex and dynamic IT landscape, a single application relies on dozens of interconnected services, from physical servers to virtual machines, cloud instances, and third-party APIs. When something goes wrong, a traditional monitoring approach that focuses on individual components is no longer enough. This is where full-stack observability becomes critical. It's the ability to gain a holistic, real-time understanding of your entire technology stack, from the user experience all the way down to the underlying network infrastructure.

Read Post

ManageEngine

Read more about Breaking Down IT Silos with OpManager Plus's Full-stack observability

Observability That Works: Understand System Failures and Drive Better Business Outcomes

Jan 22, 2026 By Stephen Watts In Splunk

Modern systems don't fail because engineers lack skills; they fail because teams can't see why systems are failing at all or can’t see why they’re failing fast enough. Often, the problem isn't a lack of tools — it's a lack of clear, connected visibility across data, teams, and systems. This is where observability transforms how organizations operate. It's no longer just about keeping systems running.

Read Post

Splunk

Read more about Observability That Works: Understand System Failures and Drive Better Business Outcomes

Unify and correlate frontend and backend data with retention filters

Jan 22, 2026 By Stella Ma In Datadog

Teams can use Datadog Real User Monitoring (RUM) and RUM without Limits to get full visibility into the frontend health of their applications while retaining only the sessions that contain critical problems that affect the end-user experience. But application errors or slowness often result from backend issues, such as database bottlenecks. To diagnose these issues, you need to correlate the frontend data from RUM with the backend data from Datadog Application Performance Monitoring (APM).

Read Post

Datadog

Read more about Unify and correlate frontend and backend data with retention filters

Understanding Lighthouse: Largest Contentful Paint

Jan 22, 2026 By Todd H. Gardner In Request Metrics

Your hero image takes 5 seconds to show up. Your headline sits invisible while JavaScript churns away. Your users? They’ve already hit the back button. That’s the cost of a slow Largest Contentful Paint, and it’s killing your conversions and search rankings. LCP is one of Google’s Core Web Vitals, which means it directly impacts how Google ranks your website. A slow LCP doesn’t just frustrate users, it actively hurts your SEO.

Read Post

Request Metrics

Read more about Understanding Lighthouse: Largest Contentful Paint

Monitoring microservices and distributed systems with Sentry

Jan 22, 2026 By Richard C. In Sentry

If you’ve ever tried to debug a request that touched five services, a queue, and a database you don’t own, you already know why monitoring distributed systems is hard. Logs live in different places, requests disappear halfway through a flow, and when something breaks in production, you’re reconstructing what happened from fragments. Microservices make this worse by design. A single request fans out across small, independently deployed services, often communicating asynchronously.

Read Post

Sentry

Read more about Monitoring microservices and distributed systems with Sentry

Measuring Claude Code ROI and Adoption in Honeycomb

Jan 22, 2026 By Mae Capozzi In Honeycomb

At Honeycomb, we’ve been using Claude Code across our engineering team for a while. Anecdotally, I had a sense of who the power users were, and I had seen some examples of complex usage. But I wanted to be able to answer questions about that usage with confidence. Claude Code supports OpenTelemetry out of the box, which means sending telemetry to Honeycomb takes just a few minutes of configuration.

Read Post

Honeycomb

Read more about Measuring Claude Code ROI and Adoption in Honeycomb

ChatOps that actually works: Grafana Cloud, Slack, and AI-powered observability

Jan 22, 2026 By Ksenia Yadav In Grafana

Context switching isn’t just inefficient—under pressure, it’s exhausting. It slows decision-making, increases the risk of mistakes, and makes even experienced engineers feel like they’re always a step behind the system they’re responsible for. At Grafana Labs, we want to build tools that meet you where you are. That's why we embedded Grafana Assistant, our context-aware AI assistant, directly in Grafana Cloud.

Read Post

Grafana

Read more about ChatOps that actually works: Grafana Cloud, Slack, and AI-powered observability

React 19 is coming to Grafana: what plugin developers need to know

Jan 22, 2026 By Timur Olzhabayev In Grafana

As part of the upcoming Grafana 13 release in April, we will be updating to React 19, the latest major version of the frontend library for building user interfaces. Grafana uses React as the core technology for its frontend UI and its vibrant ecosystem of plugins. This update ensures we stay aligned with the broader React ecosystem, and allows us to take advantage of ongoing performance enhancements and new functionality provided by React APIs.

Read Post

Grafana

Read more about React 19 is coming to Grafana: what plugin developers need to know

How to Troubleshoot BGP Faster with Kentik AI Advisor

Jan 22, 2026 By Kentik In Kentik

A BGP session goes down because a transit provider exceeded the maximum prefix limit. How do you find the root cause — fast? In this 10-minute demo, we walk through two approaches using Kentik AI Advisor. First, we troubleshoot step by step using natural language: asking AI Advisor to identify the affected interface, check for interface flapping, and review syslog messages until we find the maximum-prefix violation. Then we show how custom network context and natural language runbooks let AI Advisor do the entire investigation autonomously — following the same four steps a senior engineer would.

View Video

Kentik

Read more about How to Troubleshoot BGP Faster with Kentik AI Advisor

Zero Tickets Starts with DEX: Why DEX Data Is Your Missing Ingredient

Jan 22, 2026 By Megan Brake In Nexthink

Every IT leader wants fewer tickets. Many invest in automation, self-service portals, and AI agents to get there. Yet ticket volumes remain stubbornly high, and the service desk stays overloaded. The issue is not the effort or intent. It’s the approach. Most organizations are trying to eliminate tickets without understanding the experience that creates them. They optimize workflows after something breaks but ignore the conditions that cause issues in the first place.

Read Post

Nexthink

Read more about Zero Tickets Starts with DEX: Why DEX Data Is Your Missing Ingredient

2025 Highlights and What's Ahead in 2026 - Customer Brown Bag - January 22nd, 2025

Jan 22, 2026 By Sumo Logic, Inc. In Sumo Logic

Join us as David and Chas walk us through the biggest hits in 2025 -- product updates, Dojo AI Agents -- and what's to come in 2026.

View Video

Sumo Logic

Read more about 2025 Highlights and What's Ahead in 2026 - Customer Brown Bag - January 22nd, 2025

Top Distributed Tracing Tools in 2025: Updated Market Review with Cost Comparison

Jan 22, 2026 By Vaishnavi In Atatus

The distributed tracing landscape has evolved from “observability add-on” to core production infrastructure. In 2026, distributed tracing is no longer optional for engineering teams operating microservices, Kubernetes, or AI-driven workloads. It is now tightly coupled with incident response, cost optimization, and AI-assisted debugging.

Read Post

Atatus

Read more about Top Distributed Tracing Tools in 2025: Updated Market Review with Cost Comparison

The SRE Report 2026: Defensible Ns

Jan 22, 2026 By Leo Vasiliou In Catchpoint

You shouldn’t have to understand the care behind this report, unless it’s missing. For the past eight years, this research has focused on all things related to reliability and resilience. How systems behave under stress. How teams respond when things break. And how the practices continue to evolve. Reaching the eighth edition of The SRE Report attests to that and gives me pause. You can read the full report here and you can find a summary of the key findings here.

Read Post

Catchpoint

Read more about The SRE Report 2026: Defensible Ns

SRE Report 2026: What surprised us, what didn't, and why the gaps matter most

Jan 22, 2026 By Denton Chikura In Catchpoint

This is the eighth edition of the SRE Report. Eight years of tracing reliability's arc, from uptime obsession to experience, from toil to intelligence, from systems to people. This year's report is also the first since Catchpoint joined LogicMonitor. We want to acknowledge their support in keeping this work going. They get what this report means to the reliability community, and that matters. We made a deliberate choice this year to say less.

Read Post

Catchpoint

Read more about SRE Report 2026: What surprised us, what didn't, and why the gaps matter most

From Monitoring Signals to Observability Maturity

Jan 22, 2026 By Allyson Boate In InfluxData

Efficient monitoring delivers fast results: alerts fire within seconds, dashboards refresh continuously, and teams know the moment something changes. Understanding arrives later. An alert may show that a value shifted, but it does not explain why it shifted, how far the impact will spread, or which components truly matter. Teams see the signal, not the system behavior behind it. This gap defines the limit of traditional monitoring. Detection has improved, but explanation has not kept pace.

Read Post

InfluxData

Read more about From Monitoring Signals to Observability Maturity

API Health Monitoring Explained: How to Detect Silent Failures That Health Checks Miss

Jan 22, 2026 By Dotcom-Monitor In Dotcom-Monitor

APIs sit at the center of modern digital systems. They power mobile apps, enable partner integrations, and connect internal services across distributed architectures. When an API fails, the impact is immediate: broken user journeys, stalled transactions, and downstream systems that quietly stop working. That’s why API health monitoring is now a core reliability practice for modern engineering teams. The problem is that “API health” is often defined too narrowly.

Read Post

Dotcom-Monitor

Read more about API Health Monitoring Explained: How to Detect Silent Failures That Health Checks Miss

Observability for GenAI Applications (Grafana OpenTelemetry Community Call)

Jan 22, 2026 By Grafana In Grafana

In this episode, we’re diving into observability for Generative AI apps. AI helps us write code and monitor applications in production - but how do we observe the AI itself? And how do we make sense of complex, non-deterministic AI systems? We’re joined by two great guests: Ishan Jain, working on GenAI observability and Luccas Quadros, working on Grafana Assistant. Together, they bring both platform-level insights and real-world perspectives.

View Video

Grafana

Read more about Observability for GenAI Applications (Grafana OpenTelemetry Community Call)

How to Scan for IP Address on a Network? - Ultimate Guide & 6 Best IP Scanners

Jan 22, 2026 By Staff Contributor In SolarWinds

Amid predictions that 39.42 billion devices will have internet connectivity by 2030, IP address management has become a fundamental housekeeping and security concern for any networking admin. As the Internet of Things (IoT) continues to endow more and more devices with smart capabilities, networking grows more complex, making IP-centered network security measures a business imperative.

Read Post

SolarWinds

Read more about How to Scan for IP Address on a Network? - Ultimate Guide & 6 Best IP Scanners

How to Use the Secure Vault in Uptime.com

Jan 22, 2026 By Uptime Website Monitoring In uptime

In this tutorial, we explore Uptime.com's Secure Vault and how to securely create, edit, and manage your credentials. Learn how to access the Vault, add new Vault Items including Username/Password pairs, Certificates, Single Secret Tokens, and Time-based One-Time Passwords (TOTP), and use them in HTTP(S), API, Transaction, and Page Speed checks. Discover enhanced security features, including 256-bit AES-GCM encryption and zero-trust credential storage. We also cover REST API integration, variable usage, and user permissions.

View Video

uptime

Read more about How to Use the Secure Vault in Uptime.com

Sponsored Post

Monitoring MongoDB

Jan 21, 2026 By NiCE IT Mgmt In NiCE IT Mgmt

As enterprises increasingly rely on MongoDB to power modern applications, ensuring the database's performance, availability, and reliability has become critical. MongoDB's distributed architecture and dynamic workloads provide flexibility and scalability, but they also introduce monitoring challenges that can impact application performance and business continuity.

Read Post

NiCE IT Mgmt

Read more about Monitoring MongoDB

Key Financial Services Industry Trends Shaping 2026

Jan 21, 2026 By Rebecca Grassing In Auvik

The financial services industry is continuing its acceleration. AI is rolling out across the enterprise, and compliance expectations continue to diverge based on jurisdiction. It’s an unprecedented technology shift to say the least, and the pressure is being felt throughout the IT industry to catch up and remain resilient. More important now than ever before, learn how Auvik provides financial institutions with full network visibility and monitoring that catches problems before they become outages.

Read Post

Auvik

Read more about Key Financial Services Industry Trends Shaping 2026

Testing Icinga in a Homelab Setup With Nextcloud

Jan 21, 2026 By Jolien Trog In Icinga

If you want to get started with Icinga but don’t have a data center lying around, no worries. Icinga is a lightweight monitoring tool that works for both large infrastructures and small home labs. When I first explored Icinga during my first year as an apprentice, it was also my first real contact with monitoring tools. After completing the Icinga Fundamentals training, I wanted to experiment with hosts and services, but what should I monitor?

Read Post

Icinga

Read more about Testing Icinga in a Homelab Setup With Nextcloud

Setup Sentry in Next.js | Debugging Next.js Applications with Sentry

Jan 21, 2026 By Sentry In Sentry

Setting up Sentry is literally one CLI command. This video walks you through how to create a new Next.js project on Sentry, how to install Sentry into that project, and how to trigger and monitor your first error.

View Video

Sentry

Monitoring

Read more about Setup Sentry in Next.js | Debugging Next.js Applications with Sentry

Sourcemaps | Debugging Next.js Applications with Sentry

Jan 21, 2026 By Sentry In Sentry

Configuring source maps for your Next.js application with a Sentry project gives you detailed information about where performance issues and errors are triggered in your code, making it faster to debug and resolve issues.

View Video

Sentry

Monitoring

Read more about Sourcemaps | Debugging Next.js Applications with Sentry

Easily Map Logs to OCSF with Datadog Observability Pipelines

Jan 21, 2026 By Datadog In Datadog

Normalizing security logs into the Open Cybersecurity Schema Framework (OCSF) is often complex, manual, and time-consuming. With Datadog Observability Pipelines, you can easily transform logs into OCSF format, right in your own environment, before routing them to destinations like Splunk, CrowdStrike, and AWS Security Lake. This video shows how security teams can use Observability Pipelines to collect, process, and transform logs into OCSF format automatically.

View Video

Datadog

Read more about Easily Map Logs to OCSF with Datadog Observability Pipelines

Reducing Alert Noise with Composite Alerts in Hosted Graphite

Jan 21, 2026 By Benjamin Pitts In MetricFire

Traditional alerts are simple by design: if a metric crosses a threshold, fire an alert. While that simplicity makes alerts easy to configure, it also leads to alert noise, because single metrics rarely tell the full story and often trigger during non-actionable conditions. Hosted Graphite Composite Alerts solve this by allowing you to combine multiple alert conditions using logical expressions like AND (&&) and OR (||).
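To make the idea concrete, here is a minimal Python sketch of composite evaluation. It is not Hosted Graphite's implementation or API: the `Condition` helper, the metric names, and the thresholds are all hypothetical, chosen only to show how combining conditions with AND and OR can suppress a non-actionable firing.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Condition:
    """One threshold check on a single metric (hypothetical helper)."""
    metric: str
    threshold: float

    def fires(self, sample: Dict[str, float]) -> bool:
        # A traditional alert: fire when the metric crosses its threshold.
        return sample[self.metric] > self.threshold


# Hypothetical metric names and thresholds, for illustration only.
high_cpu = Condition("cpu_percent", 90.0)
high_latency = Condition("p95_latency_ms", 500.0)
high_errors = Condition("error_rate", 0.05)


def composite_fires(sample: Dict[str, float]) -> bool:
    """Evaluate the expression (high_cpu && high_latency) || high_errors."""
    return (high_cpu.fires(sample) and high_latency.fires(sample)) or high_errors.fires(sample)


# A busy batch job: CPU is high, but users are not affected.
batch_job = {"cpu_percent": 97.0, "p95_latency_ms": 120.0, "error_rate": 0.001}
# A real incident: CPU is high and latency has degraded.
incident = {"cpu_percent": 97.0, "p95_latency_ms": 900.0, "error_rate": 0.001}

print(high_cpu.fires(batch_job), composite_fires(batch_job))
print(high_cpu.fires(incident), composite_fires(incident))
```

In this sketch the single-metric CPU alert fires for both samples, while the composite expression fires only when high CPU coincides with degraded latency, or when the error rate is high on its own.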

Read Post

MetricFire

Read more about Reducing Alert Noise with Composite Alerts in Hosted Graphite

Green dashboards, red flags

Jan 21, 2026 By Milin Desai In Sentry

A VP of Engineering (from a company I’m not allowed to name) told me recently: "You helped us find and fix real user-facing issues. Now we need to convince our CTO why that matters more than the standard SLOs and systems." Here's the thing: your CTO is not wrong in measuring the systems and basic uptime. That’s the baseline though. They’re all trying to watch everything, but they’re seeing nothing as it relates to users.

Read Post

Sentry

Read more about Green dashboards, red flags

Why AI Automation for ITOps Needs Context Graphs

Jan 21, 2026 By Margo Poda In LogicMonitor

AI automation in ITOps fails because execution loses decision context, and context graphs turn incident history into durable execution memory that systems can actually reuse. AI automation for ITOps fails because it remembers what it did, but not why. Fixing an issue depends on what was tried last time, what failed, what worked, which exceptions were approved, and under what conditions. That information rarely lives in the system.

Read Post

LogicMonitor

Read more about Why AI Automation for ITOps Needs Context Graphs

What is HEAL Monitoring Tool? A Comprehensive Guide for IT Leaders

Jan 21, 2026 By Renuka Suresh In HEAL Software

Your organization has invested heavily in monitoring tools for application performance, infrastructure monitoring tools for servers and databases, log monitoring tools, network monitoring tools, and third-party monitoring tools for specific services. But the actual problem is your IT team is drowning in that data. A single production issue generates 30+ alerts across applications, databases, servers, and monitoring tools, creating an alert flood that buries the actual problem.

Read Post

HEAL Software

Read more about What is HEAL Monitoring Tool? A Comprehensive Guide for IT Leaders

When Things Go Wrong, Systems Should Help Humans - Not Fight Them

Jan 21, 2026 By James Barnes In StatusCake

In the previous post, we explored how AI accelerates delivery and compresses the time between change and user impact. As velocity increases, knowing that something has gone wrong before users do becomes a critical capability. But detection is only the beginning. Once alerts fire and dashboards light up, humans still have to interpret what’s happening, make decisions under pressure, and act.

Read Post

StatusCake

Read more about When Things Go Wrong, Systems Should Help Humans - Not Fight Them

Why Visibility Into Work Patterns Is the Real Competitive Edge for Remote Teams

Jan 21, 2026 By OpsMatters In OpsMatters

A remote day slips off track when work shifts in ways no one can see. Tasks move, pause, or double back without a clear signal, and the slowdown hits the team before anyone can trace where the drift began. This article explores how visibility into daily work patterns becomes the edge that keeps remote teams steady. Remote computer monitoring software helps you read those patterns earlier and act with precision.

Read Post

OpsMatters

Read more about Why Visibility Into Work Patterns Is the Real Competitive Edge for Remote Teams

Cloud Provider Status Report - December 2025

Jan 20, 2026 By Nuno Tomas In isDown

This report presents incident data from major cloud providers for December 2025, covering AWS, Azure DevOps, DigitalOcean, Fly.io, Heroku, Linode, Netlify, Railway, Render, and Vercel. The data includes both officially reported incidents from provider status pages and unconfirmed incidents detected by IsDown's monitoring system.

Read Post

isDown

Read more about Cloud Provider Status Report - December 2025

New API feature: Status-based filtering for monitors and components

Jan 20, 2026 By Valeria Kurolapova In StatusGator

We’ve added a new enhancement to the StatusGator API that makes it faster and easier to identify service issues in real time: status-based filtering for monitors and components.

Read Post

StatusGator

Read more about New API feature: Status-based filtering for monitors and components

Event Intelligence Solutions - A New Era for IT Operations

Jan 20, 2026 By david.arrowsmith In Interlink

In an era where digital performance defines business success, large enterprises are embracing Event Intelligence Solutions (EIS) to keep services available and resilient, and customer-facing operations protected from disruption. According to Gartner, Event Intelligence Solutions use AI and advanced analytics to enhance and automate how organizations respond to signals generated by digital services.

Read Post

Interlink

Read more about Event Intelligence Solutions - A New Era for IT Operations

Taking Server Monitoring to the Next Level

Jan 20, 2026 By Babu Sundaram In eG Innovations

For many years, uptime and availability have been basic standard measures of server health monitoring. But if a server is up and responding to a ping or HTTP request, does that really mean that all is well? In reality, uptime and availability alone often provide a false sense of security. A server can be technically “up” while being seconds away from a crash, running out of memory, operating with an expired license, or silently failing critical updates.

Read Post

eG Innovations

Read more about Taking Server Monitoring to the Next Level

A New Way to Debug Query Performance in Cloud Dedicated

Jan 20, 2026 By Reid Kaufmann In InfluxData

I’d like to share a new influxctl ease-of-use feature in v2.12.0 that makes it easier to optimize important queries or debug slow ones. influxctl has had the capability to send queries and display the results in JSON or tabular formats for some time.

Read Post

InfluxData

Read more about A New Way to Debug Query Performance in Cloud Dedicated

Spark: An IT Agent for Every Employee

Jan 20, 2026 By Pedro Bados In Nexthink

It’s no secret that all software, and more broadly any technology that doesn’t move atoms, is ripe for disruption by the current and future capabilities of large language models. Any workflow, application, or digital process that can be expressed in code can be redesigned, improved, and transformed at speed and scale. AI-first companies will outpace legacy players by orders of magnitude, and many workflow-based models with humans in the loop will be fundamentally reshaped.

Read Post

Nexthink

Read more about Spark: An IT Agent for Every Employee

Vercel Logs on Sentry

Jan 20, 2026 By Sentry In Sentry

Set up a Log Drain from Vercel and monitor all of your application data in Sentry. We'll take a quick look at how to set up a Log Drain, and show you two quick examples of how they can be used.

View Video

Sentry

Monitoring

Read more about Vercel Logs on Sentry

Monitor Arista VeloCloud SD-WAN performance with Datadog

Jan 20, 2026 By David Pointeau In Datadog

As organizations grow their cloud environments and branch office networks, maintaining reliable connectivity and application performance becomes more complex. VeloCloud SD-WAN provides dynamic, policy-based routing to help ensure that your connectivity is dependable and cost-efficient, and that your applications perform consistently.

Read Post

Datadog

Read more about Monitor Arista VeloCloud SD-WAN performance with Datadog

Full Circles: DEX in the Age of Agentic AI featuring Christy Punch (Forrester)

Jan 20, 2026 By Nexthink In Nexthink

In a full-circle moment for The DEX Show, Tom and Tim welcome guest speaker, Forrester’s new Digital Workplace & DEX analyst, Christy Punch, for her first podcast in the role—echoing the show’s very first Forrester guest back in 2020. The timing is bittersweet: it’s also Tim Flower’s final month as co-host, marking a major transition for the podcast.

View Video

Nexthink

Read more about Full Circles: DEX in the Age of Agentic AI featuring Christy Punch (Forrester)

Try SolarWinds Observability Today

Jan 20, 2026 By solarwindsinc In SolarWinds

When every second counts, your IT systems can’t afford blind spots. SolarWinds Observability delivers AI-powered, contextual awareness to help IT teams keep critical services running no matter the complexity. Connect the dots across networks, applications, cloud environments, and physical infrastructure with one comprehensive observability platform. With intelligent insights and real-time visibility, SolarWinds helps you prevent downtime, troubleshoot faster, and resolve issues before they impact users even in the most demanding environments.

View Video

SolarWinds

Read more about Try SolarWinds Observability Today

CloudOps Revolution: Redefining Saas Operations

Jan 20, 2026 By Faith Shulman In Digitate

Picture a small B2B software company in the early 2010s. The company grew in the 2000s to a few thousand customers by offering on-prem software to small and medium businesses. Senior leaders had recently decided to offer their software over the web via servers rented from AWS.

Read Post

Digitate

Read more about CloudOps Revolution: Redefining Saas Operations

Shadow AI Is the Next Big IT Risk | SolarWinds TechPod #105

Jan 20, 2026 By solarwindsinc In SolarWinds

Shadow AI and “vibe apps” allow employees to build powerful automations without governance. SolarWinds experts explain why this trend could explode in 2026 — and what IT leaders should watch for.

View Video

SolarWinds

Read more about Shadow AI Is the Next Big IT Risk | SolarWinds TechPod #105

Multi-Tenant Network Monitoring for MSPs

Jan 20, 2026 By Andrii Kernitskyi In Obkio

Managing 50 client networks means 50 separate monitoring instances, 50 sets of credentials, and 50 different dashboards to check daily. Every morning starts with logging into multiple platforms, context switching between interfaces, and hoping you didn't miss a critical alert buried somewhere. Traditional network monitoring tools weren't exactly built for MSPs. They're designed for single organizations monitoring their own infrastructure, which means every client you onboard adds exponential complexity.

Read Post

Obkio

Read more about Multi-Tenant Network Monitoring for MSPs

Getting Started with the InfluxDB 3 Explorer UI

Jan 20, 2026 By InfluxData In InfluxData

In this video, we walk you through setting up and getting started with the InfluxDB Explorer UI - a convenient, easy way to query and explore your data. This walks you through installation and connecting with the Explorer UI, as well as loading data, running queries, and setting up a basic dashboard.

View Video

InfluxData

Read more about Getting Started with the InfluxDB 3 Explorer UI

AI SRE Update: Your Feedback Shaped Our Latest Release

Jan 20, 2026 By Mezmo In Mezmo

A note from Lauren Nagel, Mezmo's VP of Product: At Mezmo, we believe the best observability tools aren't just built for users, they're built with them. Since the launch of Mezmo's AI SRE agent, we've listened and learned from our customers. The feedback and insights have been invaluable in helping our teams refine and enhance the experience. Today, we're excited to share our latest release, packed with improvements and powerful new capabilities that make our AI SRE even faster and more intuitive.

Read Post

Mezmo

Read more about AI SRE Update: Your Feedback Shaped Our Latest Release

Telemetry Talks - Ep.1 - Observability and OpenTelemetry

Jan 20, 2026 By VictoriaMetrics In VictoriaMetrics

In the first episode of Telemetry Talks, Diana talks with Jose, VictoriaMetrics Cloud Lead, about the practical origins of observability and how OpenTelemetry is shaping modern monitoring. They cover why observability became critical as systems moved from monoliths to microservices, how OpenTelemetry unifies traces, metrics, and logs while avoiding vendor lock-in, and how it integrates natively with VictoriaMetrics.

View Video

VictoriaMetrics

Monitoring

Read more about Telemetry Talks - Ep.1 - Observability and OpenTelemetry

Introducing The First Graylog Helm Chart Beta V1.0.0

Jan 20, 2026 By Jeff Darrington In Graylog

Running Graylog on Kubernetes has been possible for a while, but let’s be honest: it usually involved a fair amount of DIY. Custom manifests, duct-taped values files, and more than one late-night kubectl describe pod. That changes today. We’re releasing the first-ever Graylog Helm chart for Kubernetes — now available in beta.

Read Post

Graylog

Read more about Introducing The First Graylog Helm Chart Beta V1.0.0

Why IT Leaders Are Consolidating Observability Tools in 2026

Jan 20, 2026 By Sofia Burton In LogicMonitor

Consolidation unifies your observability stack, readies it for AI, and paves the path to autonomous IT. Many IT leaders consider consolidation because of cost pressure or rising vendor spend. But the real challenge goes deeper. IT environments have become more complex, distributed, and noisy, making it difficult for fragmented tools to keep up.

Read Post

LogicMonitor

Read more about Why IT Leaders Are Consolidating Observability Tools in 2026

Top tips: Designing systems people won't work around

Jan 19, 2026 By Shawn King Jason In ManageEngine

Top tips is a weekly column where we highlight what’s trending in the tech world today and list ways to explore these trends. This week, we’re looking at why people bypass systems—and how better design choices can prevent it. When people work around systems, it’s tempting to blame their behavior. In reality, most employee workarounds are signals.

Read Post

ManageEngine

Read more about Top tips: Designing systems people won't work around

Organize your monitors with groups

Jan 19, 2026 By Valeria Kurolapova In StatusGator

This is one of our most requested features – and it’s finally here. Many of you told us that as your monitoring setup grows, it becomes harder to manage long lists of services and harder for users to quickly understand what’s actually affected during an incident. Monitor groups were built to solve exactly that. Now you can organize related monitors together and present a clearer, more structured view of system health everywhere StatusGator is used.

Read Post

StatusGator

Read more about Organize your monitors with groups

High Cardinality Metrics: How Prometheus and ClickHouse Handle Scale

Jan 19, 2026 By Aditya Godbole In Last9

TL;DR: Prometheus pays cardinality costs at write time (memory, index). ClickHouse pays at query time (aggregation memory). Neither is "better": they fail differently. Design your pipeline knowing which failure mode you're accepting. Every month, someone posts "just use ClickHouse for metrics" or "Prometheus can't handle scale." Both statements contain a kernel of truth wrapped in dangerous oversimplification.
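As a rough illustration of why cardinality is the quantity to watch, the Python sketch below estimates the worst-case number of distinct series for one metric. In Prometheus, every unique combination of label values is its own time series, so the upper bound is the product of the distinct values per label. The label names and counts are invented for this example and do not come from the post.

```python
from math import prod
from typing import Dict


def series_upper_bound(distinct_values_per_label: Dict[str, int]) -> int:
    """Worst-case series count for one metric: the product of the
    number of distinct values each label can take."""
    return prod(distinct_values_per_label.values())


# Hypothetical label sets, for illustration only.
base_labels = {"service": 20, "endpoint": 50, "status_code": 5}
with_user_id = {**base_labels, "user_id": 10_000}

print(series_upper_bound(base_labels))
print(series_upper_bound(with_user_id))
```

With these assumed numbers, adding a single `user_id` label raises the bound from 5,000 series to 50,000,000. Real workloads rarely hit the full product because not every combination occurs, but the multiplication explains how one unbounded label can dominate the cost, whether that cost is paid when writing or when aggregating.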

Read Post

Last9

Read more about High Cardinality Metrics: How Prometheus and ClickHouse Handle Scale

Observability with AI? Honeycomb with AI!

Jan 19, 2026 By Jessica Kerr (Jessitron) In Honeycomb

Since Honeycomb started, it has had a weakness: too many choices. Every field, custom or standard, hundreds of them, all are free to group, filter, and visualize in dozens of ways. Which ones are interesting? Honeycomb exists to help people understand custom software. It doesn’t pretend to know what matters in your application. That’s an interpretive task, not programmatic. Hey, computers can do interpretation now!

Read Post

Honeycomb

Read more about Observability with AI? Honeycomb with AI!

VirtualMetric's Hybrid Security Data Collection Architecture: Performance and Scale Without Compromise

Jan 19, 2026 By VirtualMetric In VirtualMetric

Modern security operations face a growing architectural challenge: collect telemetry from everywhere, process it in real time, and route it to multiple platforms while maintaining data sovereignty, avoiding agent sprawl, and keeping costs under control. Single-model collection strategies force security teams to make compromises. Agent-only models create operational overhead and maintenance risk. Agentless-only approaches simplify operations but limit depth and flexibility.

Read Post

VirtualMetric

Read more about VirtualMetric's Hybrid Security Data Collection Architecture: Performance and Scale Without Compromise

Lightrun Runtime Context MCP | Lightrun

Jan 19, 2026 By Lightrun In Lightrun

In this video, Lightrun's Moshe Sambol walks you through the power of Lightrun MCP and Runtime Context, a game-changer for AI-assisted development. This integration lets developers debug live issues, inspect real-world variables, and verify fixes across environments, all without leaving the IDE. With Lightrun MCP, you can: Capture live transaction state directly from Staging and Production. Identify root causes using real runtime values, not just static code. Verify fixes instantly without redeploying or context switching.

View Video

Lightrun

Read more about Lightrun Runtime Context MCP | Lightrun

Most Popular Java Web Frameworks in 2026

Jan 18, 2026 By Vince Power In Rollbar

Look, if you're starting a new Java web project in 2026, you should probably just use Spring Boot. With 14.7% usage in the 2025 Stack Overflow Developer Survey and a 53.7% admiration score among all web frameworks, it remains the default choice for modern Java web development. It has the largest ecosystem, best documentation, most active community, and strongest cloud-native support—now enhanced with built-in AI capabilities through Spring AI.

Read Post

Rollbar

Read more about Most Popular Java Web Frameworks in 2026

EPISODE 5 - The AI FOMO Cure

Jan 18, 2026 By Digitate In Digitate

Tom Stoneman and Digitate Field CTO Efrain Ruh discuss real-world AI adoption roadblocks. They cover the three pillars of success (integration, observability, transparency) and how fear stalls progress.

View Video

Digitate

Read more about EPISODE 5 - The AI FOMO Cure

Major outage takes down X and Grok

Jan 17, 2026 By Colin Bartlett In StatusGator

On January 16, 2026 the social media platform X (formerly known as Twitter) and its AI chatbot, Grok, experienced a widespread outage affecting users around the world. This incident underscores why proactive outage detection matters. StatusGator’s Early Warning Signals spotted meaningful signs of disruption long before any official provider acknowledgment appeared publicly and helped organizations prepare or respond faster than waiting for status pages or press releases.

Read Post

StatusGator

Read more about Major outage takes down X and Grok

New API endpoints: Pause and resume website & ping monitors

Jan 16, 2026 By Valeria Kurolapova In StatusGator

We’ve added new API capabilities that give you more control over your monitoring workflows – directly from code. You can now pause and resume website and ping monitors via the StatusGator API, exposing the same pause functionality that’s available in the UI.

Read Post

StatusGator

Read more about New API endpoints: Pause and resume website & ping monitors

New: Organization security and privacy controls

Jan 16, 2026 By Valeria Kurolapova In StatusGator

We’ve added new organization-level security and privacy controls to give admins more control over how their StatusGator account is accessed and how data is used. You can now manage these settings from Organization settings → Security options.

Read Post

StatusGator

Read more about New: Organization security and privacy controls

Microsoft SCOM Cheat Sheet

Jan 16, 2026 By NiCE IT Mgmt In NiCE IT Mgmt

Everything you need to know to run SCOM like a professional. View & Download Whitepaper.

Read Post

NiCE IT Mgmt

Read more about Microsoft SCOM Cheat Sheet

Verizon outage - January 14

Jan 16, 2026 By Andy Libby In StatusGator

When a major carrier like Verizon goes down, the impact is immediate and widespread. On January 14, 2026, thousands of users across the United States found themselves without cellular service, unable to make calls, send texts, or access data. While social media erupted with reports of “SOS mode” on iPhones, official acknowledgment from the provider lagged behind for hours.

Read Post

StatusGator

Read more about Verizon outage - January 14

Datadog vs. New Relic: 2026 Comparison

Jan 16, 2026 By Aiswarya S In Atatus

If you're working in IT monitoring and observability, you simply cannot ignore the power of Datadog and New Relic. These two tools have plenty of features that can revolutionize your entire observability strategy and give you complete control over your infrastructure. These tools are built so as to capture the tiniest of details, be it on applications, infrastructure, databases, servers, or something completely on the cloud.

Read Post

Atatus

Read more about Datadog vs. New Relic: 2026 Comparison

Why Today's ITOps Workflows Break When Systems Get Too Big

Jan 16, 2026 By Margo Poda In LogicMonitor

Modern hybrid environments change continuously, but legacy ITOps workflows assume stable infrastructure. IT environments don’t behave in predictable ways. Infrastructure changes continuously, services spin up and shut down on demand, and data formats evolve with every deployment. Most ITOps workflows, however, are still designed around the assumption of stability. That mismatch drives failure. Static runbooks expect environments to stay put.

Read Post

LogicMonitor

Read more about Why Today's ITOps Workflows Break When Systems Get Too Big

Easy Guide for Connecting VictoriaMetrics to a Grafana Data Source

Jan 16, 2026 By Benjamin Pitts In MetricFire

VictoriaMetrics is a fast, cost-efficient, and highly scalable time-series database designed as a drop-in replacement for Prometheus storage. It is widely used for collecting, storing, and querying metrics at scale, while remaining lightweight enough to run as a single binary or container. Because it is fully Prometheus-compatible, VictoriaMetrics supports standard PromQL queries and integrates seamlessly with Grafana.

Read Post

MetricFire

Read more about Easy Guide for Connecting VictoriaMetrics to a Grafana Data Source

Elevating global operations: Mastering multi-cluster Elastic deployments with Fleet

Jan 16, 2026 By Nima Rezainia In Elastic

In today's global enterprises, distributed infrastructure is the norm, not the exception. Organizations operate across continents and are driven by customer proximity and regulatory requirements. For the Elastic Stack, this reality often translates into a multi-cluster deployment model, where data is collected and stored in multiple geographically dispersed Elasticsearch clusters. But why adopt complexity? The decision to decentralize data storage is generally driven by three critical factors.

Read Post

Elastic

Read more about Elevating global operations: Mastering multi-cluster Elastic deployments with Fleet

What Are the Pillars of Observability?

Jan 16, 2026 By Coroot In Coroot

Understand the four pillars of observability (metrics, logs, traces, and profiles) with Co-founder Peter Zaitsev.

View Video

Coroot

Read more about What Are the Pillars of Observability?

5 Best Error Monitoring Tools to Use in 2026

Jan 16, 2026 By Rollbar Editorial Team In Rollbar

The best tool to track, analyze, and manage errors at scale? Rollbar tops our list.

Read Post

Rollbar

Read more about 5 Best Error Monitoring Tools to Use in 2026

Building reliable dashboard agents with Datadog LLM Observability

Jan 16, 2026 By Harmit Minhas In Datadog

This article is part of our series on how Datadog’s engineering teams use LLM Observability to iterate, evaluate, and ship AI-powered agents. In this first story, the Graphing AI team shares how they instrumented their widget- and dashboard-generation agents with LLM Observability to detect regressions and debug failures faster. Visibility into how large language model (LLM) applications behave in real time is essential for building reliable AI-driven systems at Datadog.

Read Post

Datadog

Read more about Building reliable dashboard agents with Datadog LLM Observability

Sponsored Post

EventSentry v6: Azure Logs, HEC, Sigma, Log Signing & More

Jan 15, 2026 By ingmar.koecher In EventSentry

Even though the shift to the cloud has slowed recently as many businesses are moving certain workloads back on-premise, Microsoft Exchange remains one cloud-based service that most organizations continue to embrace – despite its frequent outages. This doesn’t come as a surprise, as Microsoft has successfully devolved on-prem Exchange Server – the only viable alternative – into an unfriendly dragon that even experienced sysadmins won’t touch with a 10 ft pole.

Read Post

EventSentry

Read more about EventSentry v6: Azure Logs, HEC, Sigma, Log Signing & More

Observability Pricing Models: How to Evaluate Cost, Value, and Predictability

Jan 15, 2026 By Kristy Slimmer In Galileo

Observability pricing often seems reasonable at the outset, but many organizations discover its real complexity only as environments scale and usage patterns change. As environments grow more complex and hybrid by default, many organizations struggle with rising costs, fragmented tools, and pricing models that complicate cost predictability and long-term planning.

Read Post

Galileo

Read more about Observability Pricing Models: How to Evaluate Cost, Value, and Predictability

Time Series Meets Graph: Understanding Relationships in Streaming Data

Jan 15, 2026 By Allyson Boate In InfluxData

Data systems rarely operate as isolated components. Machines depend on sensors, services rely on other services, and devices exchange data through shared gateways. When something changes, the impact often spreads beyond a single metric. To trace how changes move through complex systems, many teams turn to graph-style analysis to map dependencies and follow cause and effect.

Read Post

InfluxData

Read more about Time Series Meets Graph: Understanding Relationships in Streaming Data

Office Hours with David Girvin

Jan 15, 2026 By Sumo Logic, Inc. In Sumo Logic

Weekly office hours with David Girvin. Check out recent feature releases and updates, watch a quick live demo, and ask any questions with live Q&A.

View Video

Sumo Logic

Read more about Office Hours with David Girvin

Understanding Lighthouse: First Contentful Paint

Jan 15, 2026 By Todd H. Gardner In Request Metrics

You ran Lighthouse and got flagged for a slow First Contentful Paint. The score is orange (or worse, red), and the help text says something about “the time at which the first text or image is painted.” But what does that actually mean for your users? And why should you care?

Read Post

Request Metrics

Read more about Understanding Lighthouse: First Contentful Paint

Lightrun MCP: AI agents now validate your code with live Runtime Context

Jan 15, 2026 By Lightrun In Lightrun

Lightrun R&D Team Lead Or Galon and Engineer Roy Chen demo how the new Lightrun MCP allows AI coding assistants to access Runtime Context and validate how software will behave in production.

View Video

Lightrun

Read more about Lightrun MCP: AI agents now validate your code with live Runtime Context

The Time for Cost-Intelligent Observability is Now

Jan 15, 2026 By Teia Jensen In LogicMonitor

Cost-Intelligent Observability shifts your organization into a proactive and collaborative cloud cost management powerhouse.

Read Post

LogicMonitor

Read more about The Time for Cost-Intelligent Observability is Now

"You Had One Job": Why Twenty Years of DevOps Has Failed to Do it

Jan 15, 2026 By Charity Majors In Honeycomb

Let’s start with a question. What is DevOps all about? I’ll tell you my answer. In retrospect, I think the entire DevOps movement was a mighty, twenty-year battle to achieve one thing: a single feedback loop connecting devs with prod. On those grounds, it failed. Not because software engineers weren’t good at their jobs, or didn’t care enough. It failed because the technology wasn’t good enough.

Read Post

Honeycomb

Read more about "You Had One Job": Why Twenty Years of DevOps Has Failed to Do it

Dossinth AI - where we're at

Jan 15, 2026 By Lucian Daniliuc In Monitive

The AI sidekick I'm building, called Dossinth, is starting to take shape. So far, it has become a moderately useful CRM with a nice dashboard, an event log, a daily briefing (which I also receive by email), account management, and a knowledge base. It's less AI and more CRM. For now. Here's a quick tour...

Read Post

Monitive

Read more about Dossinth AI - where we're at

What is Runtime Context? A Practical Definition for the AI Era

Jan 15, 2026 By Roy Chen In Lightrun

TLDR: Runtime Context is live, execution-level access to a running production system. It lets engineers and AI agents ask precise questions of running code and get answers immediately, without redeploying or interrupting users. This is the new baseline for reliability.

Read Post

Lightrun

Read more about What is Runtime Context? A Practical Definition for the AI Era

Fleet Management and Terraform: Use cases and best practices for managing collectors in Grafana Cloud

Jan 15, 2026 By Kristie Grebe In Grafana

Earlier this year we launched Grafana Cloud Fleet Management to address the pain that comes with managing scores of telemetry collectors across departments and environments. We've been excited to see how organizations are using it to manage collectors at scale, but we've also heard from users who aren't sure how Fleet Management fits with their existing infrastructure-as-code tooling. The good news is Fleet Management is designed specifically to complement—not replace—tools like Terraform.

Read Post

Grafana

Read more about Fleet Management and Terraform: Use cases and best practices for managing collectors in Grafana Cloud

Paginating large datasets in production: Why OFFSET fails and cursors win

Jan 15, 2026 By Lazar Nikolov In Sentry

The things that separate an MVP from a production-ready app are polish, final touches, and the Pareto ‘last 20%’ of work. Many of the bugs, edge cases, and performance issues will come to the surface after you launch, when the user stampede puts a serious strain on your application. If you’re reading this, you’re probably sitting on the 80% mark, ready to tackle the rest.

Read Post

Sentry

Read more about Paginating large datasets in production: Why OFFSET fails and cursors win

Agentless First, Agents When Needed: A Hybrid Approach to Security Telemetry

Jan 15, 2026 By VirtualMetric In VirtualMetric

Security data collection has become a first-class architectural concern for modern SOCs. Once collection is treated as a dedicated layer, separate from analytics and detection, the next question becomes practical: how should telemetry be collected in a way that aligns with this architecture? In the previous article, we examined why this shift occurred. Here, we focus on how different collection models (agent-based, agentless, and hybrid) fit into modern security data collection architectures.

Read Post

VirtualMetric

Read more about Agentless First, Agents When Needed: A Hybrid Approach to Security Telemetry

The fragile web: 2025's lessons on uptime, reality, and engineering rigor

Jan 14, 2026 By Ramkumar Ramaswamy In Site24x7

If you are into IT operations or leadership, you likely spent at least one weekend in 2025 huddled over a laptop while the rest of the world slept. For the last decade, our industry has pursued five nines (99.999% uptime) as the holy grail. We architected redundant systems, deployed across multiple availability zones, and optimized our code until it hummed. We convinced ourselves that if we just engineered hard enough, we could tame the chaos of the internet. We thought we could. We really did.

Read Post

Site24x7

Read more about The fragile web: 2025's lessons on uptime, reality, and engineering rigor

Simplify the Collection Layer and Move to OTel Without the Agent Sprawl

Jan 14, 2026 By Mezmo In Mezmo

This is blog 2 in our New Year, New Resolution Series on OTel migrations. Read the first post, "New Year, New Telemetry: Resolve to Stop Breaking Dashboards", here. Most New Year’s resolutions fail because they require a "big bang" change. If your 2026 mandate is to migrate to OpenTelemetry (OTel), the traditional approach is the definition of friction.

Read Post

Mezmo

Read more about Simplify the Collection Layer and Move to OTel Without the Agent Sprawl

VictoriaLogs Basics: What You Need to Know, with Examples & Visuals

Jan 14, 2026 By Phuong Le In VictoriaMetrics

This post covers ~80% of VictoriaLogs concepts and features people ask about most often. For deeper details and full references, see the official VictoriaLogs documentation. Part of a series.

Read Post

VictoriaMetrics

Read more about VictoriaLogs Basics: What You Need to Know, with Examples & Visuals

Cribl Search Pack for Outlook Email Activity

Jan 14, 2026 By Cribl In Cribl

Email is still mission-critical, but most teams have very little visibility into what’s actually happening behind the scenes. In this video, I give a quick walkthrough of an inbox intelligence dashboard built on Cribl Search. It shows email volume, delivery health, and unusual activity at a glance, without digging through raw logs, unless of course you like doing that.

View Video

Cribl

Read more about Cribl Search Pack for Outlook Email Activity

Logging in React Native with Sentry

Jan 14, 2026 By Lewis D. In Sentry

Logs are often the first place dev teams look when they investigate an issue. But logs are often added as an afterthought, and developers struggle with the balance of logging too much or too little. As a seasoned developer, you may remember a time when you were asked to investigate an issue and then handed a 200 MB plaintext log file. Three hours and four Python scripts later, you would realize that the problem was in a different component.

Read Post

Sentry

Read more about Logging in React Native with Sentry

OpenTelemetry and Grafana Labs: What's new and what's next in 2026

Jan 14, 2026 By Marylia Gutierrez In Grafana

For many teams, 2024 was the year of asking, “can OpenTelemetry do this?” In 2025, the community answered with a resounding “yes,” moving beyond experimentation to focus on what matters most in practice: stability, ease of use, and cross-project compatibility. That momentum now sets the stage for what’s to come for OpenTelemetry in 2026.

Read Post

Grafana

Read more about OpenTelemetry and Grafana Labs: What's new and what's next in 2026

A Day in the Life of ITOps: Why Manual Ops Can't Scale Without AI Automation

Jan 14, 2026 By Margo Poda In LogicMonitor

A typical ITOps day is consumed by manual triage, fragmented context, and coordination work that expands with scale and slows every incident. Your day begins with alerts that arrived overnight. The symptoms are partial and the blast radius is unclear, so the first task is not remediation; it is figuring out what is real, what is related, and what matters. Next, a ticket comes in with a brief description and no evidence. Ownership is unclear.

Read Post

LogicMonitor

Read more about A Day in the Life of ITOps: Why Manual Ops Can't Scale Without AI Automation

The Ultimate Guide to Error Monitoring: Why Error Monitoring Matters More Than Ever in 2026

Jan 14, 2026 By Sarah Morgan In Scout

Errors get a bad rap, but they’re just trying to help. Remember, errors aren’t the enemy, they’re the messenger. Conventional wisdom tells you to think of errors as failures, as things that thwart progress and frustrate developers. The reality is that errors are actually there to help you. They prevent you from shipping broken code to production. They stop your application from continuing to operate incorrectly and costing you money.

Read Post

Scout

Read more about The Ultimate Guide to Error Monitoring: Why Error Monitoring Matters More Than Ever in 2026

When AI Speeds Up Change, Knowing First Becomes the Constraint

Jan 14, 2026 By James Barnes In StatusCake

In a recent post, I argued that AI doesn’t fix weak engineering processes; rather it amplifies them. Strong review practices, clear ownership, and solid fundamentals still matter just as much when code is AI-assisted as when it’s not. That post sparked a follow-up question in the comments that’s worth sitting with: With AI speeding things up, how do teams realise something’s gone wrong before users do? It’s the right question to ask next.

Read Post

StatusCake

Read more about When AI Speeds Up Change, Knowing First Becomes the Constraint

How to Monitor SaaS Status in 2026: A Complete Guide

Jan 14, 2026 By Hrishikesh Barua In IncidentHub

This is an updated and expanded version of the older guide. According to the 2025 State of SaaS report, organizations use an average of 106 SaaS apps. Staying on top of your SaaS vendors' status is as important as monitoring your own services. The Cloudflare, AWS, Azure, and Google Cloud outages in 2025 were strong reminders of this fact.

Read Post

IncidentHub

Read more about How to Monitor SaaS Status in 2026: A Complete Guide

IT Monitoring News | January '26 Edition

Jan 13, 2026 By NiCE IT Mgmt In NiCE IT Mgmt

Latest updates, insights, events, and more regarding Microsoft SCOM, and Azure Monitor.

Read Post

NiCE IT Mgmt

Read more about IT Monitoring News | January '26 Edition

OpAMP Explained: Why OpenTelemetry Needed an Agent Management Protocol (and How We Use It)

Jan 13, 2026 By Tyler Helmuth In Honeycomb

OpenTelemetry makes it easy to produce and transmit any type of telemetry. In production environments, this often means deploying the OpenTelemetry Collector as an intermediary to process, enrich, and route telemetry data. As systems scale, so does this infrastructure—sometimes to hundreds or thousands of Collectors spread across environments.

Read Post

Honeycomb

Read more about OpAMP Explained: Why OpenTelemetry Needed an Agent Management Protocol (and How We Use It)

Cribl Search Pack for Sigma Rules

Jan 13, 2026 By Cribl In Cribl

Big alert. Old data. No time for a replay. In this video, learn how to run Sigma detections directly against object storage—S3, Azure Blob, or GCS—using Cribl Search. No rehydration, no re-ingest, no SIEM bill shock. Just click, run, and hunt across months or years of data with ready-to-use Sigma Packs.

View Video

Cribl

Read more about Cribl Search Pack for Sigma Rules

Top 15 Lumigo Competitors & Alternatives 2026

Jan 13, 2026 By Pavithra Parthiban In Atatus

Lumigo is a cloud-native observability platform designed primarily for serverless applications and microservices, providing distributed tracing, error detection, and performance monitoring. However, Lumigo may not meet every team's needs due to limitations in features, pricing, scalability, or support for other environments. Many organizations require Lumigo alternatives that provide broader infrastructure monitoring, more advanced analytics, or support for multi-cloud setups.

Read Post

Atatus

Read more about Top 15 Lumigo Competitors & Alternatives 2026

Not everything that breaks is an error: a Logs and Next.js story

Jan 13, 2026 By Sergiy Dybskiy In Sentry

Stack traces are great, but they only tell you what broke. They rarely tell you why. When an exception fires, you get a snapshot of the moment things went sideways, but the context leading up to that moment? Gone. That's where logs come in. A well-placed log can be the difference between hours of head-scratching and a five-minute fix. Let me show you what I mean with a real bug I encountered recently.

Read Post

Sentry

Read more about Not everything that breaks is an error: a Logs and Next.js story

Continuous Profiling Explained: Master Performance in Production

Jan 13, 2026 By Mohana Ayeswariya J In Atatus

Backend systems rarely fail in obvious ways. More often, they degrade over time. CPU usage slowly increases, request latency creeps up, and costs rise without a clear explanation. Metrics tell you something is wrong, traces show where requests go, but neither explains why your code behaves the way it does under real load. Continuous profiling fills that gap. Atatus continuous profiling runs automatically in production with minimal overhead.

Read Post

Atatus

Read more about Continuous Profiling Explained: Master Performance in Production

Bindplane + Oodle.ai: AI-Native Observability Meets AI-Driven Telemetry Pipelines

Jan 13, 2026 By Adnan Rahic & Sai Prameela Konduru In ObservIQ

Today, we’re excited to announce a new integration between Bindplane and Oodle.ai — combining an AI-driven, OpenTelemetry-native telemetry pipeline with an AI-native observability platform built for extreme scale. With Bindplane acting as the control plane for telemetry and Oodle.ai providing AI-powered analysis across logs, metrics, and traces, you get a single, intelligent, vendor-neutral pipeline from raw telemetry to actionable insight.

Read Post

ObservIQ

Read more about Bindplane + Oodle.ai: AI-Native Observability Meets AI-Driven Telemetry Pipelines

Optimizing BESS Operations: Real-Time Monitoring & Predictive Maintenance with InfluxDB 3

Jan 13, 2026 By Suyash Joshi In InfluxData

For IT and OT engineers managing Battery Energy Storage Systems (BESS) and other distributed energy resources (DER), the challenge isn’t just dealing with energy. It’s a data problem: managing the massive stream of real-time telemetry these systems generate. For example, a BESS site produces a constant stream of time-series data from BMS, PCS, SCADA, EMS, and more, and operating it means ingesting, correlating, and acting on that data in real time. And this challenge changes with scope.

Read Post

InfluxData

Read more about Optimizing BESS Operations: Real-Time Monitoring & Predictive Maintenance with InfluxDB 3

Why Observability Budgets Keep Growing Even When IT Is Asked to Cut Costs

Jan 13, 2026 By Sofia Burton In LogicMonitor

Observability is the surprising budget line that isn’t shrinking. 96% of IT leaders expect observability budgets to hold steady or grow over the next 12 months. And 62% expect those budgets to increase regardless of broader IT budget cuts. Why? Because as infrastructure becomes more distributed and harder to manage, observability has shifted from a “nice to have” to a control point for cost, performance, and risk.

Read Post

LogicMonitor

Read more about Why Observability Budgets Keep Growing Even When IT Is Asked to Cut Costs

EventSentry Training [05-05]: HTTP Receiver (HEC) / Network Services

Jan 13, 2026 By NETIKUS.NET LTD In EventSentry

How to set up EventSentry to collect (receive) logs in JSON format from remote devices via the HTTPS protocol (aka HEC).

View Video

EventSentry

Read more about EventSentry Training [05-05]: HTTP Receiver (HEC) / Network Services

EventSentry Training [08-12]: Digitally Signing Event Logs

Jan 13, 2026 By NETIKUS.NET LTD In EventSentry

How to digitally sign event logs and Syslog data with EventSentry.

View Video

EventSentry

Read more about EventSentry Training [08-12]: Digitally Signing Event Logs

Getting the Right Signals: Mobile Observability with Embrace and SquaredUp

Jan 13, 2026 By John Hayes In Squared Up

More than half of all connections to web services now originate from mobile devices. Mobile apps are no longer peripheral - they are central to how businesses engage customers, deliver services, and generate revenue. Despite this shift, many organizations still rely on observability tools that are fundamentally server-centric. These platforms are adept at monitoring backend health, but they often fail to capture what’s happening at the edge - on the mobile device itself.

Read Post

Squared Up

Read more about Getting the Right Signals: Mobile Observability with Embrace and SquaredUp

OpenTelemetry Overview: Unifying Traces, Metrics, and Logs

Jan 13, 2026 By Staff Member In SolarWinds

The IT landscape has evolved rapidly, transitioning from monolithic applications to complex, distributed system architectures comprising microservices that run on platforms like Kubernetes. With this added complexity, simply checking if a server is running is no longer sufficient. As IT professionals, we need insight into what’s really happening inside these systems. That’s where observability comes in.

Read Post

SolarWinds

Read more about OpenTelemetry Overview: Unifying Traces, Metrics, and Logs

Reduce Wasted Spend with Datadog Kubernetes Autoscaling

Jan 13, 2026 By Datadog In Datadog

Balancing performance and cost in Kubernetes doesn’t have to be a tradeoff. In this short demo, you’ll see how Datadog helps teams rightsize workloads, reduce over-provisioning, and measure real cost savings using Kubernetes autoscaling.

View Video

Datadog

Read more about Reduce Wasted Spend with Datadog Kubernetes Autoscaling

IT Trends and Predictions for 2026 - SolarWinds TechPod 105

Jan 13, 2026 By solarwindsinc In SolarWinds

SolarWinds TechPod returns with its annual IT trends and predictions episode — and 2026 is all about Agentic AI. In this episode of SolarWinds TechPod, hosts Sean Sebring and Chrystal Taylor are joined by Sascha Giese (SolarWinds) and Lauren Okruch (SolarWinds Product Marketing) to break down how AI, ITSM, automation, governance, and resilience will shape IT operations in 2026.

View Video

SolarWinds

Read more about IT Trends and Predictions for 2026 - SolarWinds TechPod 105

EventSentry v6.0: New Features Overview

Jan 13, 2026 By NETIKUS.NET LTD In EventSentry

Introduction video showcasing the main new features in EventSentry v6.0, including native Azure log support, HEC, digital log signing, OAuth support, location filtering, Sigma, and more.

View Video

EventSentry

Read more about EventSentry v6.0: New Features Overview

Clarity - Loved by Customers. Respected by Analysts.

Jan 13, 2026 By ValueOps by Broadcom In Broadcom

Clarity, a leading Strategic Portfolio Management (SPM) solution by Broadcom, closes the strategy-execution divide by connecting financial predictability to business outcomes for unmatched transparency. Track where the money goes with AI-driven traceability. Turn compliance into collaboration with a solution fit to your organization’s unique needs and processes – SPM, your way! – enhancing decision-making, resource utilization, and strategic alignment.

View Video

Broadcom

Read more about Clarity - Loved by Customers. Respected by Analysts.

Breaking the Iron Triangle: How AI-powered investigations change the economics of uptime

Jan 13, 2026 By Stephanie Closson In Grafana

In engineering, there's a concept known as the Iron Triangle. With three sides (cost, quality, time), it's a framework intended to help you prioritize different aspects of project management. Want fast, high-quality features? It'll cost you. Need to keep costs down while maintaining quality? That'll take time. And if you're trying to move fast and cheap? Well, good luck with quality. For years, this has been the brutal reality of running services on the web.

Read Post

Grafana

Read more about Breaking the Iron Triangle: How AI-powered investigations change the economics of uptime

Reality Bytes: Waymo on the Tracks (2026 Predictions)

Jan 13, 2026 By Nexthink In Nexthink

The Matrix hits different... when you're in the Matrix. The team rings in 2026 by reflecting on past predictions and charting what’s next. From eerily accurate calls on AI consolidation to the unsettling prescience of The Matrix, the conversation looks ahead to a pivotal shift: from conversational AI to operational “do-bots,” the challenge of measuring real enterprise value, and the growing risks of over-automation.

View Video

Nexthink

Read more about Reality Bytes: Waymo on the Tracks (2026 Predictions)

What's New in VictoriaMetrics Cloud Q4 2025? New tiers, more deployment options, IaC and alerting rules.

Jan 13, 2026 By Jose Gomez-Selles In VictoriaMetrics

2025 has been quite a year! As we enter 2026, we want to share all the great features that VictoriaMetrics Cloud has brought in the last quarter. Remember that this Quarterly Live Update is also available in video format. Let’s get to it!

Read Post

VictoriaMetrics

Read more about What's New in VictoriaMetrics Cloud Q4 2025? New tiers, more deployment options, IaC and alerting rules.

The Self-Aware Enterprise: Systems That Understand Themselves

Jan 13, 2026 By ScienceLogic In ScienceLogic

Automation revealed truth. AI learned to reason from it. Now, systems are beginning to understand themselves. The self-aware enterprise isn’t a vision of autonomy. It’s a model of awareness. It sees, understands, and acts with precision based on verified knowledge of how it operates. This is the next evolution of intelligence in IT. Not artificial. Not imagined. Built.

Read Post

ScienceLogic

Read more about The Self-Aware Enterprise: Systems That Understand Themselves

How to debug a Next.js production bug with Logs and Sentry

Jan 13, 2026 By Sentry In Sentry

Stack traces tell you what broke. They rarely tell you why. In this video, Serge walks through a real Next.js production bug that only affected Firefox and Safari. The error showed up clearly in Sentry, but the stack trace alone wasn’t enough to explain what was going wrong. The missing piece turned out to be logs. You’ll see how adding logs to a Next.js API route exposed unexpected request data, how those logs connected back to traces, and how that context made the root cause obvious and easy to fix.

View Video

Sentry

Read more about How to debug a Next.js production bug with Logs and Sentry

Top cloud cost management trends in 2026

Jan 12, 2026 By Sinjan Ballav In ManageEngine

Cloud spending has shifted from an IT afterthought to a strategic performance lever. As organizations head into 2026, many IT teams are rethinking how they use, govern, and optimize cloud resources, not just how much they consume. Enterprises, startups, and MSPs are entering an efficiency-first era, fueled by multi-cloud adoption, distributed architectures, and a growing need to balance performance with predictable budgets. The question is no longer: How much are we spending?

Read Post

ManageEngine

Read more about Top cloud cost management trends in 2026

Cribl Search Pack for Missing Logs

Jan 12, 2026 By Cribl In Cribl

Ever run a SIEM search only to see nothing for your firewall logs? In this video, we show a smarter way to detect when log sources stop sending data using Cribl Lake, Cribl Search, and Cribl Stream. Learn how to track “last seen” times, build efficient aggregations, and get real-time alerts—without burning SIEM resources or storage.

View Video

Cribl

Read more about Cribl Search Pack for Missing Logs

Easy Guide for Connecting Redis to a Grafana Data Source

Jan 12, 2026 By Benjamin Pitts In MetricFire

Redis is a widely used in-memory data store, commonly deployed as a cache, session store, message broker, or fast key-value database. Because Redis often sits on the critical path of an application, having visibility into its behavior (memory usage, client connections, command throughput, cache efficiency) is essential for troubleshooting and performance tuning.

Read Post

MetricFire

Read more about Easy Guide for Connecting Redis to a Grafana Data Source

Automate flaky test fixes with the Bits AI Dev Agent and Test Optimization

Jan 12, 2026 By Eric Metaj In Datadog

Flaky tests are a significant source of inefficiency that impacts many engineering teams. Along with failing your build, they interrupt your entire development flow, generate excessive CI/CD noise, and, critically, compromise developer trust in the test suite itself. Datadog Test Optimization enables you to manage test suites at scale by pinpointing the flakiest tests, analyzing their history across hundreds of runs, and automatically surfacing the root cause.

Read Post

Datadog

Read more about Automate flaky test fixes with the Bits AI Dev Agent and Test Optimization

How we built an AI SRE agent that investigates like a team of engineers

Jan 12, 2026 By Daniel Shan In Datadog

We built Bits AI SRE to help engineers investigate and solve production incidents, one of the most difficult aspects of operating distributed systems today. As environments grow more dynamic and complex, resolving issues becomes more challenging. Failures now span more services, involve noisier signals, and encompass larger volumes of telemetry data, making it hard for on-call engineers to find root causes quickly. Today, Bits AI SRE is already helping teams decrease time to resolution by up to 95%.

Read Post

Datadog

Read more about How we built an AI SRE agent that investigates like a team of engineers

Heroku Monitoring Add-ons 2026 and Hosted Graphite

Jan 12, 2026 By MetricFire Blogger In MetricFire

Monitoring performance of Heroku applications helps improve user experience. This blog post covers Heroku monitoring add-ons and explores why Hosted Graphite is the best choice in 2026. We'll discuss the benefits and setup process of the Hosted Graphite add-on. We'll also discuss future trends in Heroku monitoring.

Read Post

MetricFire

Read more about Heroku Monitoring Add-ons 2026 and Hosted Graphite

The 54% Improvement Playbook: How Top Performers Integrate GenAI into ITSM

Jan 12, 2026 By solarwindsinc In SolarWinds

Don't just read the report—learn how to replicate its most impressive results. In our 2025 State of ITSM Report, a select group of top-performing organizations achieved a staggering 54.3% reduction in resolution time by strategically integrating GenAI. This live session moves beyond the data to share their playbook. We'll provide a step-by-step guide on how to pair GenAI with foundational ITSM practices and demonstrate how to weave these tools into your team's daily workflows to achieve maximum efficiency.

View Video

SolarWinds

Read more about The 54% Improvement Playbook: How Top Performers Integrate GenAI into ITSM

Real User Monitoring Dashboard

Jan 12, 2026 By Uptime Website Monitoring In uptime

Get to know the Uptime.com RUM dashboard. No two websites – or report baselines – are exactly the same. However, all businesses benefit from the ability to leverage real user experience data to optimize their sites for speed, errors, and more.

View Video

uptime

Read more about Real User Monitoring Dashboard

What's New in VictoriaMetrics Cloud Q3 2025 - Cloud Database

Jan 12, 2026 By VictoriaMetrics In VictoriaMetrics

Join Marc Sherwood and Jose Gomez-Selles as they unveil the significant updates to VictoriaMetrics Cloud from Q3 2025 and share a glimpse into the exciting roadmap for what's coming next! This session is packed with new features designed to make your monitoring experience more robust, user-friendly, and cost-effective. In this video, you'll discover: Expansion to Asia! VictoriaMetrics Cloud now has a brand new region on AWS ap-southeast-1 (Singapore) in Asia Pacific, bringing lower latency and regional data sovereignty closer to your teams and deployments.

View Video

VictoriaMetrics

Read more about What's New in VictoriaMetrics Cloud Q3 2025 - Cloud Database

How to Monitor Network Performance for Multi-Site Businesses

Jan 12, 2026 By Andrii Kernitskyi In Obkio

When you’re a business managing network performance across 15 branch offices in different cities, you’re going to see some blind spots. Your headquarters may experience consistent connectivity, while remote locations experience unpredictable slowdowns that can affect your daily operations.

Read Post

Obkio

Read more about How to Monitor Network Performance for Multi-Site Businesses

Looking back on 2025, and what's next

Jan 11, 2026 By Max Rozen In OnlineOrNot

Continuing the tradition I started to wrap up 2024, in this article I'll go over what's new in OnlineOrNot from 2025.

Read Post

OnlineOrNot

Read more about Looking back on 2025, and what's next

Intercom outage - January 9th, 2026

Jan 10, 2026 By Colin Bartlett In StatusGator

Ever had that sinking feeling when your help desk just stops responding, but the official status page says everything is “up and running”? That’s exactly what happened on January 9, 2026, when Intercom – one of the world’s most popular support tools – hit a major snag. While hundreds of companies were left staring at loading circles, StatusGator was already on the case.

Read Post

StatusGator

Read more about Intercom outage - January 9th, 2026

RapidSpike Status Pages: Clearer, Smarter, More Transparent

Jan 9, 2026 By Georgina Grant-Muller In RapidSpike

Clear communication is everything when it comes to service availability. Whether you’re managing a critical website, a SaaS platform, or customer-facing infrastructure, your users expect clarity, honesty, and real-time insight when things don’t go to plan. That’s why we’re excited to introduce the newly refreshed RapidSpike Status Pages, redesigned to look better, work smarter, and provide deeper, more meaningful insight at a glance.

Read Post

RapidSpike

Read more about RapidSpike Status Pages: Clearer, Smarter, More Transparent

Why Synthetic Tracing Delivers Better Data, Not Just More Data

Jan 9, 2026 By Gerardo Dada In Catchpoint

In modern observability practices, distributed tracing has become table stakes. Most application performance monitoring (APM) platforms encourage an “instrument everything” approach: Deploy an SDK or agent, hook into every service call and capture every user interaction at scale. On paper, this sounds like complete visibility. In practice, it can turn into a costly firehose of data with diminishing returns.

Read Post

Catchpoint

Read more about Why Synthetic Tracing Delivers Better Data, Not Just More Data

Beyond the Blue Link: UX Patterns for Google's AI Overviews, AI Mode & Answer Engines

Jan 9, 2026 By Germain UX Team In Germain UX

The blue link is dying—but not in the way we expected. When Google’s AI Overviews began appearing at the top of the search results page, the SEO community panicked. Publishers watched click-through rates plummet. The Pew Research Center confirmed their fears: searchers who encounter an AI summary are half as likely to click on traditional search results (8% vs. 15%).

Read Post

Germain UX

Read more about Beyond the Blue Link: UX Patterns for Google's AI Overviews, AI Mode & Answer Engines

Website Optimizations

Jan 9, 2026 By Pēteris Caune In Healthchecks

Over the last few weeks, I indulged myself in doing a few “nice to have” website optimizations. They were.

Read Post

Healthchecks

Read more about Website Optimizations

Types of Cyber Security Attacks

Jan 9, 2026 By Staff Contributor In SolarWinds

Damaging cyber attacks are a rising concern as organizations increasingly rely on digital technology for managing sensitive data and running core business operations. While technology can increase business efficiency, without security measures in place, a digital-first approach can end up introducing vulnerabilities and putting data at risk.

Read Post

SolarWinds

Read more about Types of Cyber Security Attacks

A better way to prioritize feature backlogs: the CERB scoring method

Jan 9, 2026 By Dave Thompson In Grafana

When you're on a software team, planning for the weeks and months to come is always a challenge. You have to balance deep feature backlogs, business and leadership requests, customer requests, and operational interruptions. Effective planning requires a way to prioritize the backlog, set realistic roadmap goals, and justify decisions.

Read Post

Grafana

Read more about A better way to prioritize feature backlogs: the CERB scoring method

InfluxDB For Beginners

Jan 9, 2026 By InfluxData In InfluxData

In this video, we go over getting started with InfluxDB, including downloading, installing, writing, and reading data with a couple of different clients.

View Video

InfluxData

Read more about InfluxDB For Beginners

Guide to Sending Custom Metrics From Your Heroku Application

Jan 9, 2026 By Benjamin Pitts In MetricFire

Heroku makes it easy to deploy and operate applications without managing servers, but understanding how your application behaves internally still requires instrumentation. Platform metrics like CPU usage, memory consumption, and router request/status counts are useful, but they don’t tell you how long your code takes to run, when your app throws errors, or whether users are interacting with key features.

Read Post

MetricFire

Read more about Guide to Sending Custom Metrics From Your Heroku Application

New in Bindplane: Permalinks

Jan 9, 2026 By Cole Laven In ObservIQ

I’m excited to announce a new feature in Bindplane: Permalinks. Available in Bindplane Cloud right now! Permalinks will be shipped in version v1.97.0 and above in Self-hosted Bindplane. Permalinks make it easy to share a single URL that takes teammates, support engineers, or other stakeholders directly to the exact view you’re looking at. No extra navigation, no guessing, and no “can you click over here?” moments.

Read Post

ObservIQ

Read more about New in Bindplane: Permalinks

Top 7 Kubernetes Add-ons

Jan 9, 2026 By Staff Contributor In SolarWinds

The open-source Kubernetes platform is designed to help simplify application deployment through Linux containers. It supports tasks like deploying workloads in the form of pods, clustering nodes, managing container runtimes, and tracking resources. The Kubernetes microservices system has risen in popularity over the last several years as an easy way to support, scale, and manage applications.

Read Post

SolarWinds

Read more about Top 7 Kubernetes Add-ons

Cribl Search Pack for Windows

Jan 9, 2026 By Cribl In Cribl

Get instant visibility into Windows event logs, system_state, process events and AD logs. The Cribl Search Pack for Windows highlights performance and security signals at a glance, helping teams quickly spot anomalies.

View Video

Cribl

Read more about Cribl Search Pack for Windows

Vibe coding tools observability with VictoriaMetrics Stack and OpenTelemetry

Jan 9, 2026 By Alexander Marshalov In VictoriaMetrics

AI-powered coding assistants have transformed how developers write software. Tools like Claude Code, OpenAI Codex, Gemini CLI, Qwen Code, and OpenCode have introduced what many call “vibe coding” — a new paradigm where users describe their intent and AI agents handle the implementation details. But as these tools become integral to development workflows, a critical question emerges: how do we understand what’s happening under the hood?

Read Post

VictoriaMetrics

Read more about Vibe coding tools observability with VictoriaMetrics Stack and OpenTelemetry

Design effective executive dashboards with Datadog

Jan 9, 2026 By MacKenna Kelleher In Datadog

In most organizations, leaders are surrounded by data: revenue reports, customer analytics, uptime metrics, support tickets, and more.

Read Post

Datadog

Read more about Design effective executive dashboards with Datadog

Lightrun MCP: Your AI Assistant Now Debugs and Validates Production Code

Jan 9, 2026 By Lightrun In Lightrun

Intermittent production bugs are hard to debug and rarely reproduce locally. Teams fall into a loop of adding logs, and every rollback slows them down. In this demo, R&D team leads Maor Yaffe and Or Golan show how an AI assistant can verify production issues using real runtime data, without redeploying. By connecting Cursor to Lightrun MCP, the agent inspects live production behavior, collects real variable values, and confirms the root cause with evidence instead of assumptions.

View Video

Lightrun

Read more about Lightrun MCP: Your AI Assistant Now Debugs and Validates Production Code

Datadog integrations 2025 recap: Observability for AI, security, and hybrid cloud

Jan 9, 2026 By Alex Guo In Datadog

The year 2025 marked a major milestone in the Datadog integrations ecosystem as we surpassed 1,000 integrations. Along the way, we also added over 110 new technology partners and expanded coverage across the fastest growing software categories, including AI, distributed security, hybrid infrastructure, and data intelligence. This recap highlights the most impactful integrations we released this year and how they connect to these broader technology trends.

Read Post

Datadog

Read more about Datadog integrations 2025 recap: Observability for AI, security, and hybrid cloud

Understanding Kubernetes Performance: Top Tips From Experts

Jan 9, 2026 By Staff Contributor In SolarWinds

The way Kubernetes works “under the hood” dictates which components are more important for improving performance and which are less important. So let’s talk about the Kubernetes internals first.

Read Post

SolarWinds

Read more about Understanding Kubernetes Performance: Top Tips From Experts

Top tips: RAG isn't the problem, context is. Here are 3 fixes.

Jan 8, 2026 By Alsherin In ManageEngine

Top Tips is a weekly column where we highlight what’s trending in the tech world and list ways to explore these trends. This week, we’ll be talking about how we can improve our retrieval-augmented generation (RAG) systems using contextual engineering. Prompt engineering has gained a lot of attention in the past year, and it’s finally time to move on to a better experience that transforms the way AI results are provided to us.

Read Post

ManageEngine

Read more about Top tips: RAG isn't the problem, context is. Here are 3 fixes.

IT Observability in 2026: Lessons From the Past Year

Jan 8, 2026 By Kristy Slimmer In Galileo

As IT organizations enter 2026, many of the assumptions around monitoring and observability have already been tested. Throughout 2025, infrastructure teams made it clear that visibility alone is not enough. Alerts without context, short data retention, and fragmented tools limited teams’ ability to explain behavior, validate changes, and plan with confidence. This article looks at what emerged from those experiences and how observability expectations continue to shift.

Read Post

Galileo

Read more about IT Observability in 2026: Lessons From the Past Year

VirtualMetric DataStream + Amazon Security Lake: OCSF-Ready Security Data Without Custom Pipelines

Jan 8, 2026 By VirtualMetric In VirtualMetric

Security teams are increasingly turning to Amazon Security Lake to consolidate security telemetry across cloud, network, and on-prem environments. Security Lake provides a unified, OCSF-based data repository that powers analytics, threat hunting, and machine learning across AWS services and third-party tools. But to take advantage of Security Lake’s capabilities, organizations must deliver clean, normalized, OCSF-compliant data, and this is where challenges arise.

Read Post

VirtualMetric

Read more about VirtualMetric DataStream + Amazon Security Lake: OCSF-Ready Security Data Without Custom Pipelines

Context is King: Why Network AI Needs Domain Knowledge to Work

Jan 8, 2026 By Phil Gervasi In Kentik

Generic AI fails in network operations because it lacks the “institutional knowledge” of your specific environment and business priorities. Learn how Kentik’s Custom Network Context encodes your unique operational reality into AI Advisor, turning a generic chatbot into a context-aware teammate.

Read Post

Kentik

Read more about Context is King: Why Network AI Needs Domain Knowledge to Work

How to Integrate Grafana with Home Assistant

Jan 8, 2026 By Community In InfluxData

This post covers how to get started with Home Assistant and Grafana, including setting up InfluxDB and Grafana with Docker, configuring InfluxDB to receive data from Home Assistant, and creating a Grafana dashboard to visualize your data. It provides a comprehensive guide for real-time monitoring and analysis of Home Assistant data. In this tutorial, you’ll learn how to integrate Grafana with Home Assistant using InfluxDB.

Read Post

InfluxData

Read more about How to Integrate Grafana with Home Assistant

Make Your Engineering Processes Resilient. Not Your Opinions About AI

Jan 8, 2026 By James Barnes In StatusCake

Why strong reviews, accountability, and monitoring matter more in an AI-assisted world. Artificial intelligence has become the latest fault line in software development. For some teams, it’s an obvious productivity multiplier. For others, it’s viewed with suspicion: a source of low-quality code, unreviewable pull requests, and latent production risk. One concern we hear frequently goes something like this: It’s an understandable fear; and also the wrong conclusion.

Read Post

StatusCake

Read more about Make Your Engineering Processes Resilient. Not Your Opinions About AI

Unity SDK 4.0.0: Console support, logs, user feedback and more

Jan 8, 2026 By Stefan Jandl In Sentry

We just released the Sentry SDK for Unity 4.0.0, our biggest update yet. This major release brings comprehensive gaming console support, structured logging, user feedback capabilities, and significant improvements to help you build better games across all platforms. Here's what's new.

Read Post

Sentry

Read more about Unity SDK 4.0.0: Console support, logs, user feedback and more

Logs on Sentry in 60 seconds

Jan 8, 2026 By Sentry In Sentry

Aggregate and query logs on Sentry. Add logs to your errors to not only make debugging easier, but also give valuable context to Seer AI. Logs help you and Seer better understand issues.

View Video

Sentry

Read more about Logs on Sentry in 60 seconds

Sending Custom Application Metrics to MetricFire's Hosted Graphite

Jan 8, 2026 By Benjamin Pitts In MetricFire

In this article, we’ll show how easy it is to send custom application metrics directly to MetricFire's public carbon endpoint. We’ll build a small Flask application, emit a handful of practical metrics, and generate local traffic to demonstrate how quickly meaningful data can flow from your code to your dashboards.

Read Post

MetricFire

Read more about Sending Custom Application Metrics to MetricFire's Hosted Graphite

SSH Check Overview

Jan 8, 2026 By Uptime Website Monitoring In uptime

In this video, learn how to set up and configure SSH checks using Uptime.com. We discuss the frequency options, the importance of Secure Shell (SSH) for secure data communication, and step-by-step instructions for creating a new SSH check in your account. Discover how to set check intervals, configure alert contacts, specify monitoring locations, and ensure your probe servers are whitelisted. Perfect for ensuring your server's remote login capabilities are continuously monitored and secure.

View Video

uptime

Read more about SSH Check Overview

Next.js with Sentry in 60 seconds

Jan 8, 2026 By Sentry In Sentry

Get started with Sentry for Next.js in just one minute.

View Video

Sentry

Monitoring

Read more about Next.js with Sentry in 60 seconds

Session Replay in 60 seconds

Jan 8, 2026 By Sentry In Sentry

Learn about Sentry's Session Replay in 60 seconds.

View Video

Sentry

Monitoring

Read more about Session Replay in 60 seconds

The Logic of AI: Why Machines Don't Think, They Reason

Jan 8, 2026 By ScienceLogic In ScienceLogic

Automation revealed truth. AI extends it. Machines don’t invent knowledge. They reason from verified data. The result is operational foresight that’s traceable, explainable, and trusted.

Read Post

ScienceLogic

Read more about The Logic of AI: Why Machines Don't Think, They Reason

How to prevent outdated server inventory risks with efficient server monitoring

Jan 7, 2026 By Geoffrin Edwin In Site24x7

At any point in time, your IT teams are constantly working on performance monitoring, security patching, scaling, and related activities. Most teams overlook one critical pillar: a reliable and up-to-date server inventory. Why did we emphasize the phrase "reliable and up-to-date"? Because there are still teams using a spreadsheet that was last updated years ago when a server inventory report is requested. What follows when you do not maintain an updated server inventory repository is.

Read Post

Site24x7

Read more about How to prevent outdated server inventory risks with efficient server monitoring

Implement dbt data quality checks with dbt-expectations

Jan 7, 2026 By Tom Sobolik In Datadog

dbt is one of the most popular solutions for data transformations and modeling. Many commercial data pipelines rely on dozens, or even hundreds, of individual dbt jobs. Data engineers, data platform engineers, and analytics engineers who own these pipelines need to maintain a testing framework to prevent mistakes in data processing that can compromise analysis.

Read Post

Datadog

Read more about Implement dbt data quality checks with dbt-expectations

Bring faster visibility into AWS Lambda functions with remote instrumentation

Jan 7, 2026 By Tal Usvyatsky In Datadog

Comprehensive observability is critical for running performant, reliable, and secure serverless workloads. However, configuring and maintaining that visibility across hundreds or thousands of serverless functions can be difficult to scale and sustain. Developers across teams often manage serverless functions using different infrastructure as code (IaC) frameworks, as well as different review, deployment, and update processes.

Read Post

Datadog

Read more about Bring faster visibility into AWS Lambda functions with remote instrumentation

Easiest Way to Connect InfluxDB to a Grafana Data Source

Jan 7, 2026 By Benjamin Pitts In MetricFire

InfluxDB is a widely used time-series database designed for storing and querying metrics, events, and telemetry data. It’s commonly used for infrastructure monitoring, application instrumentation, and IoT-style workloads where time-based data is central. In many environments, InfluxDB already exists as part of the monitoring or data collection pipeline, and the primary need is simply to visualize that data effectively.

Read Post

MetricFire

Read more about Easiest Way to Connect InfluxDB to a Grafana Data Source

New Year, New Telemetry: Resolve to Stop Breaking Dashboards

Jan 7, 2026 By Mezmo In Mezmo

It's 2026. Your New Year's resolution was to finally migrate to OpenTelemetry. But you're staring at dozens of dashboards that depend on your current data format, and that migration deadline is looming... Sound familiar? If you're an SRE or Platform Engineer facing a top-down OTel mandate, you're not alone. The challenge isn't just about adopting a new standard—it's about doing so without disrupting the observability systems your team depends on every day.

Read Post

Mezmo

Read more about New Year, New Telemetry: Resolve to Stop Breaking Dashboards

A Bright Outlook: Building Operational Resilience for the Year Ahead

Jan 7, 2026 By Teneo In Teneo

As we step into a new year, one truth stands firm in financial services: resilience isn’t optional – it’s expected. Markets fluctuate, regulations evolve, and technology accelerates. Amid this complexity, IT leaders carry the responsibility of ensuring that operations don’t just survive disruption, they thrive through it.

Read Post

Teneo

Read more about A Bright Outlook: Building Operational Resilience for the Year Ahead

Build custom apps in seconds with conversational AI in App Builder

Jan 7, 2026 By Datadog In Datadog

Using a drag-and-drop interface, engineering teams can create apps that support troubleshooting, improve day-to-day operations, and offer self-service access without leaving Datadog. With the new conversational AI feature, teams can turn an idea into a working app in seconds. Watch the video to see how it works.

View Video

Datadog

Read more about Build custom apps in seconds with conversational AI in App Builder

Grafana Tempo: vParquet5 is coming soon (January 2026 Community Call)

Jan 7, 2026 By Grafana In Grafana

vParquet5 is coming soon. Learn about all the improvements and how to use them. Have questions? Please bring them! Can't comment in the chat? You may need to create a channel -- you can do this by clicking your photo in the top right corner. Grafana Cloud is the easiest way to get started with Grafana dashboards, metrics, logs, traces, and profiles. Our forever-free tier includes access to 10k metrics, 50GB logs, 50GB traces and more.

View Video

Grafana

Read more about Grafana Tempo: vParquet5 is coming soon (January 2026 Community Call)

How to Ensure AI-Generated Code is Reliable with Runtime Context

Jan 7, 2026 By Gideon Freud In Lightrun

TLDR: AI coding assistants have sped up code delivery, but created a validation gap. Historic telemetry and static analysis cannot predict the behavior of unfamiliar, high-volume code. Lightrun’s Runtime Context MCP closes that gap, allowing AI assistants to verify behavior before it breaks, and resolve issues in real time.

Read Post

Lightrun

Read more about How to Ensure AI-Generated Code is Reliable with Runtime Context

Grafana dashboards: tips for optimizing query performance

Jan 7, 2026 By Rao Komar In Grafana

Even with a powerful database or visualization layer, performance can suffer if queries aren’t optimized or system settings aren’t tuned. The new Mimir Query Engine in Grafana Cloud improves query efficiency, but there are still best practices you can follow to keep dashboards fast and responsive—whether your data source is hosted in Grafana Cloud or running on-premises.

Read Post

Grafana

Read more about Grafana dashboards: tips for optimizing query performance

Building Operational Resilience for the Year Ahead with Teneo's Digital Employee Experience (DEX)

Jan 7, 2026 By Teneo In Teneo

Read Post

Teneo

Read more about Building Operational Resilience for the Year Ahead with Teneo's Digital Employee Experience (DEX)

Fleet Management: Manage your telemetry collectors at scale

Jan 7, 2026 By Coralogix In Coralogix

In this video, we introduce Fleet Management and how it helps teams control their telemetry estate as it scales. See how you can centrally manage collectors and agents, standardize configurations across environments, and roll out updates confidently, reducing operational effort and risk.

View Video

Coralogix

Read more about Fleet Management: Manage your telemetry collectors at scale

Trace-connected structured logging with LogTape and Sentry

Jan 7, 2026 By Kyle Tryon In Sentry

As our applications grow from simple side projects into complex distributed systems with many users, the “old way” of console.log debugging isn’t going to hold up. To build truly observable systems, we have to transition from simple text logs to structured, queryable, trace-connected events.

Read Post

Sentry

Read more about Trace-connected structured logging with LogTape and Sentry

Splunk Cloud Platform Migration Process

Jan 7, 2026 By Splunk In Splunk

Decide on your migration strategy, optimize and move your data sources, searches, users, apps and knowledge objects to the Splunk Cloud Platform.

View Video

Splunk

Read more about Splunk Cloud Platform Migration Process

How to visualize your 3CX contact center phone system with Grafana

Jan 7, 2026 By Sofia Neroda In Grafana

Note: this post was co-authored by Nicholas Borg, 3CX Product Manager. 3CX provides a robust, flexible IP PBX platform used by organizations of all sizes to power their contact centers. It offers detailed call activity, agent performance metrics, and operational insights — all of which become even more powerful when visualized.

Read Post

Grafana

Read more about How to visualize your 3CX contact center phone system with Grafana

Sponsored Post

Best Downdetector Alternatives for Outage Monitoring in 2026

Jan 6, 2026 By Nuno Tomas In isDown

To keep operations running, businesses and individuals increasingly rely on online services. When outages occur, having the right tools to detect and respond quickly is essential. Outage monitoring platforms provide real-time insights into service disruptions, helping minimize downtime and maintain productivity. While Downdetector is a widely recognized platform, its focus on consumer-level features may not fully meet business needs. Organizations relying on multiple third-party services require tools with advanced capabilities like deeper insights, customizable notifications, and seamless integrations.

Read Post

isDown

Read more about Best Downdetector Alternatives for Outage Monitoring in 2026

Fair usage limits: a safer way to scale observability

Jan 6, 2026 By Ofri Grushka In Coralogix

For the past several years, Coralogix customers have used the platform to ingest, process, and analyze large volumes of observability data without the presence of artificial barriers or unexpected constraints. This flexibility has enabled teams to experiment freely, evolve their architectures, and scale smoothly alongside their systems.

Read Post

Coralogix

Read more about Fair usage limits: a safer way to scale observability

Agentic AI in SolarWinds Observability Self-Hosted with Demo (2026)

Jan 6, 2026 By solarwindsinc In SolarWinds

In this video, SolarWinds Tech Evangelist Chrystal Taylor and Staff Technical Trainer Cheryl Nomanson walk through a hands-on demo of the SolarWinds AI Agent, currently available for tech preview in SolarWinds Observability Self-Hosted.

View Video

SolarWinds

Read more about Agentic AI in SolarWinds Observability Self-Hosted with Demo (2026)

How to Test Network Performance: 8 Testing Methods + Tools (2026 Guide)

Jan 6, 2026 By Alyssa Lamberti In Obkio

Network performance directly impacts business productivity, user experience, and revenue. When applications lag, video calls freeze, or file transfers stall, the root cause often lies in untested network infrastructure. Yet many organizations monitor their networks reactively—only testing performance after problems emerge. This article shows you how to proactively test network performance using proven methodologies that identify issues before they affect users.

Read Post

Obkio

Read more about How to Test Network Performance: 8 Testing Methods + Tools (2026 Guide)

Why Security Data Collection Architecture Became a Core SOC Concern

Jan 6, 2026 By VirtualMetric In VirtualMetric

Security operations teams depend on telemetry from everywhere: endpoints, identity systems, cloud platforms, networks, applications, and security tools. This data underpins detection, investigation, compliance, and response.

Read Post

VirtualMetric

Read more about Why Security Data Collection Architecture Became a Core SOC Concern

A Guide to Regression Analysis with Time Series Data

Jan 6, 2026 By Community Developer In InfluxData

Regression analysis with time series data in Python provides a basis for understanding how values change over time. By following this guide, you’ll understand regression as applied to time series data, how to prepare it in Python, and how to create regression models that’ll help discover trends and influence decisions. With the vast amount of time series data generated, captured, and consumed daily, how can you make sense of it?

Read Post

InfluxData

Read more about A Guide to Regression Analysis with Time Series Data

5 Observability & AI Trends Making Way for an Autonomous IT Reality in 2026

Jan 6, 2026 By LogicMonitor In LogicMonitor

IT operations are changing faster than most people realize, making autonomous IT a 2026 reality, not a distant vision. Your team monitors tens of thousands of metrics, ingests terabytes of logs, and generates thousands of alerts daily. And somehow, you still find out about outages from customers before you see them in your tools. That gap between having visibility and actually understanding what’s happening has become the central problem.

Read Post

LogicMonitor

Read more about 5 Observability & AI Trends Making Way for an Autonomous IT Reality in 2026

Another year, another $750,000 to Open Source maintainers

Jan 6, 2026 By Chad Whitacre In Sentry

Bored yet? 2025 was the fifth year in a row (2024, 2023, 2022, 2021) that Sentry gave a pretty hefty chunk of change to the maintainers of the Open Source software that we rely on and love. This is our first report since we launched the Open Source Pledge, which brings together companies that share our respect for the independent maintainers in the community. Pledge members have collectively paid $4.5M to Open Source maintainers and foundations since launch. No more excuses!

Read Post

Sentry

Read more about Another year, another $750,000 to Open Source maintainers

AIOps Use Cases for IT

Jan 6, 2026 By Renuka Suresh In HEAL Software

Core thesis: AIOps solves the enterprise gap between telemetry volume and operational decisions by converting noisy signals into prioritized incidents with probable causality, so recovery gets faster and repeat failures stop.

Read Post

HEAL Software

Read more about AIOps Use Cases for IT

Observability Beyond Kubernetes: eBPF Magic

Jan 6, 2026 By Coroot In Coroot

Alex chats with Kris Buytaert, Co-founder of DevOpsDays, O11y, Inuits, and pivotal instigator of the movement, about why he loves using Coroot.

View Video

Coroot

Read more about Observability Beyond Kubernetes: eBPF Magic

Automating BGP Troubleshooting with Kentik AI Advisor

Jan 6, 2026 By Kentik In Kentik

In this demo, we use Kentik AI Advisor to troubleshoot a real-world BGP misconfiguration that brings down a peering session with a transit provider. You’ll see how AI Advisor works both as a dedicated page and as an in-portal overlay, using natural language to identify the affected interface, correlate SNMP and syslog data, and pinpoint a maximum-prefix issue as the root cause. Then we accelerate and standardize the workflow with custom network context and AI-powered runbooks, so every engineer can troubleshoot BGP alerts like an expert.

View Video

Kentik

Read more about Automating BGP Troubleshooting with Kentik AI Advisor

Ep 24: Governing AI in the age of agentic systems and Model Context Protocol

Jan 6, 2026 By Sumo Logic, Inc. In Sumo Logic

On this episode of Masters of Data, we unpack David's new white paper on AI governance for agentic systems. He explains the Model Context Protocol (MCP) as "APIs for agents": how AI systems talk and execute tasks. The catch? Autonomous agents are insider threats that move fast and cause serious damage. David introduces the Model Control Plane (MoCop), a twelve-pillar framework designed to prevent your AI from going rogue. We cover his roadmap for security leaders to build real controls and telemetry. His advice: treat agents like interns with root access. Get ahead of this before your agents do.

View Video

Sumo Logic

Read more about Ep 24: Governing AI in the age of agentic systems and Model Context Protocol

Understanding Lighthouse: Has a Viewport Meta Tag

Jan 6, 2026 By Todd H. Gardner In Request Metrics

You ran Lighthouse and got a passing audit for “Has a <meta name="viewport"> tag with width or initial-scale.” Great. But do you know what happens when it’s missing? Your users wait an extra 300 milliseconds on every single tap. On mobile, that’s an eternity.

Read Post

Request Metrics

Read more about Understanding Lighthouse: Has a Viewport Meta Tag

From Compliance to Confidence: Earning Trust in a World That Never Stops Changing

Jan 6, 2026 By ScienceLogic In ScienceLogic

Compliance has always been a necessity, but for many organizations, it has also been a burden. Reports, audits, manual reviews, and spreadsheets create a cycle of looking backward rather than moving forward. As systems become more dynamic, that lag between compliance checks and real-world change grows wider, creating risk that traditional methods can’t close. The goal now isn’t to check the box.

Read Post

ScienceLogic

Read more about From Compliance to Confidence: Earning Trust in a World That Never Stops Changing

Prometheus Native Histograms FTW | Big Tent S3 Ep2

Jan 6, 2026 By Grafana In Grafana

From Grafana Labs' Big Tent podcast - season 3, episode 2. Thanks for watching!

View Video

Grafana

Read more about Prometheus Native Histograms FTW | Big Tent S3 Ep2

The Prometheus Origin Story | Big Tent S3 Ep2

Jan 6, 2026 By Grafana In Grafana

From Grafana Labs' Big Tent podcast - season 3, episode 2. Thanks for watching!

View Video

Grafana

Read more about The Prometheus Origin Story | Big Tent S3 Ep2

Sponsored Post

Essential digital experience metrics for development teams

Jan 5, 2026 By Laura Marwick In Raygun

For the team that's down in the trenches untangling legacy code, writing unit tests, and just trying to come up with sensible variable names, it's easy to lose sight of the other end of the process, where code meets customer. You test, you deploy, nothing breaks, and you move on. However, it's just as important to keep an eye on code quality in production, and how it's experienced. Experience, though, is hard to quantify. What do you measure? How do you measure it? How do you improve it? And why do you care? We lay out answers in this post.

Read Post

Raygun

Read more about Essential digital experience metrics for development teams

Auvik Named a Leader Across G2's Winter 2026 Reports for Network Management

Jan 5, 2026 By Bob Wientzen In Auvik

In G2’s Winter 2026 reports, Auvik earned top recognition as a leader in network management tools across small-business, mid-market, and enterprise categories. IT professionals rated Auvik highly for implementation, usability, results, relationship, and overall Grid® performance, reflecting one thing above all: real-world trust from the IT professionals who use Auvik every day.

Read Post

Auvik

Read more about Auvik Named a Leader Across G2's Winter 2026 Reports for Network Management

OpenTelemetry Collector Contrib - A Hands-on Guide

Jan 5, 2026 By Dhruv Ahuja In SigNoz

As application systems grow more complex, it becomes ever more important to understand how services interact across distributed systems. Observability sheds light on the behavior of instrumented applications and the infrastructure they run on. This enables engineering teams to better track system health and prevent critical failures. OpenTelemetry (OTel) has standardized how we generate and transmit telemetry, and the OpenTelemetry Collector is the engine that processes and exports this data.

Read Post

SigNoz

Read more about OpenTelemetry Collector Contrib - A Hands-on Guide

What is OTLP and How It Works Behind the Scenes

Jan 5, 2026 By Dhruv Ahuja In SigNoz

If you have worked with observability tools in the last decade, you have likely managed, and been burnt by, a fragmented collection of tools and libraries. Each observability signal required its own tool, and the data formats were incompatible, with little or no correlation between them. For example, log records would not link to traces, meaning you had to guess which traces led to which events. The OpenTelemetry Protocol (OTLP) solves this by decoupling how telemetry is generated from where it is analyzed.

Read Post

SigNoz

Read more about What is OTLP and How It Works Behind the Scenes

How to Monitor Network Performance for Call Centers (Remote & On-Site)

Jan 5, 2026 By Andrii Kernitskyi In Obkio

A customer calls to place an urgent order. Your agent's VoIP line cuts out mid-sentence. Is it their home connection? Your network? The ISP? The phone system? You have no visibility, and by the time you figure it out, the customer's gone. This is the reality for modern call centers, whether your agents work from a central office, from home, or split between both. Network issues don't just slow operations; they destroy customer experiences in real time.

Read Post

Obkio

Read more about How to Monitor Network Performance for Call Centers (Remote & On-Site)

From Zero Tickets to High-ROI: AI + DEX in 2026 (w/ Samuele Gantner and Vedant Sampath)

Jan 5, 2026 By Nexthink In Nexthink

Kicking off 2026, Tim and Tom welcome Nexthink Chief Product Officer Samuele Gantner and first-time guest CTO Vedant Sampath for a candid “three pillars” deep-dive on enterprise AI. They explore how AI is reshaping product and engineering: new tooling, new development cycles, and the shift from deterministic software to probabilistic agents—plus the critical role of evals, benchmarks, guardrails, and performance. Then they unpack Nexthink’s three-pillar framework.

View Video

Nexthink

Read more about From Zero Tickets to High-ROI: AI + DEX in 2026 (w/ Samuele Gantner and Vedant Sampath)

2026 observability trends and predictions from Grafana Labs: unified, intelligent, and open

Jan 5, 2026 By Maggie Cornejo In Grafana

After a decade of dashboards, alerts, and ever-expanding telemetry pipelines, observability is changing. It is no longer just the domain of engineering: the most innovative organizations are extending observability to all areas of the business to better understand system behavior, emerging risks, and customer impact. At the same time, rising cloud costs and increasing complexity are forcing organizations to be more intentional about what they observe and why.

Read Post

Grafana

Read more about 2026 observability trends and predictions from Grafana Labs: unified, intelligent, and open

2026 Observability & AI Outlook for IT Leaders

Jan 5, 2026 By LogicMonitor In LogicMonitor

IT operations have outgrown the model they were built on. Enterprises now monitor tens of thousands of metrics, ingest terabytes of logs, and generate thousands of alerts daily, all while managing increasingly complex infrastructures that span on-prem data centers, multiple cloud environments, and emerging AI workloads. Yet despite all this telemetry, too many teams still learn about outages from customers before they see them in their tools.

Read Post

LogicMonitor

Read more about 2026 Observability & AI Outlook for IT Leaders

Website Monitoring: What, Why, and Best Practices

Jan 5, 2026 By Dotcom-Monitor In Dotcom-Monitor

In modern times, where digital presence dictates business success, understanding website monitoring is no longer optional. Whether you run an e-commerce store, SaaS platform, or enterprise website, it’s a fundamental pillar of modern operations. Even a few minutes of website downtime can result in lost revenue, damaged credibility, and frustrated users.

Read Post

Dotcom-Monitor

Read more about Website Monitoring: What, Why, and Best Practices

Your Opsgenie Migration is the Path to Proactive Reliability

Jan 5, 2026 By solarwindsinc In SolarWinds

With the Opsgenie end-of-life deadline (April 5, 2027) fast approaching, you're facing a critical choice: Do you truly need to move your dedicated Incident Response workflow into the complexity of Jira Service Management (JSM) or Compass? If your current process is a reactive treadmill—plagued by alert fatigue, lost context, and constant non-critical paging—the mandated move risks replacing one chaotic toolset with another complex ITSM solution. View this not as a burden, but as a chance to build a standardized, human-centric workflow that solves your biggest pain points and transforms your response from chaos to control.

View Video

SolarWinds

Read more about Your Opsgenie Migration is the Path to Proactive Reliability

Troubleshoot faster with the GitLab Source Code integration in Datadog

Jan 5, 2026 By Eric Metaj In Datadog

Developers and SREs who rely on GitLab to develop their services often face significant friction when troubleshooting errors or fixing issues that degrade code quality. To understand the context of a problem, they resort to tab-hopping between observability tools and GitLab, connecting stack traces, spans, and profiles back to the right files and commits.

Read Post

Datadog

Read more about Troubleshoot faster with the GitLab Source Code integration in Datadog

Why Does Observability Feel so Expensive? (Because it Is)

Jan 5, 2026 By Rox Williams In Honeycomb

If you’ve ever stared at your observability bill and whispered “there’s no way this is real,” congratulations! Your instincts are working.

Read Post

Honeycomb

Read more about Why Does Observability Feel so Expensive? (Because it Is)

Check out features we announced at AWS re:Invent in the latest episode of This Month in Datadog

Jan 5, 2026 By Datadog In Datadog

Tune in for spotlights of Bits AI SRE, now generally available, and Datadog’s MCP Server, which connects AI agents to our platform by ingesting prompts and mapping them to Datadog resources and data. Plus, we cover how to search logs at petabyte scale in your own infrastructure with CloudPrem; break down cost drivers at the prefix level with Storage Management; create workflows that adapt to real-world complexity with Agent Builder; and detect and block credential leaks with Secret Scanning.

View Video

Datadog

Read more about Check out features we announced at AWS re:Invent in the latest episode of This Month in Datadog

Tech Talk - Data Fabric: Enter the Agentic Era with Splunk AI Assistant for SPL 1.4

Jan 4, 2026 By Splunk In Splunk

Join this Tech Talk to learn more about.

View Video

Splunk

Read more about Tech Talk - Data Fabric: Enter the Agentic Era with Splunk AI Assistant for SPL 1.4

Office 365 Synthetic Monitoring for Availability & SLA Validation

Jan 3, 2026 By Dotcom-Monitor In Dotcom-Monitor

Microsoft Office 365 underpins daily work for millions of organizations. Email, collaboration, document sharing, identity, and meetings all converge into a single dependency that employees implicitly assume will “just work.” When it doesn’t, productivity halts immediately and visibly. Microsoft publishes service health dashboards and backs Office 365 with formal SLAs. On paper, availability is measured, tracked, and contractually enforced.

Read Post

Dotcom-Monitor

Read more about Office 365 Synthetic Monitoring for Availability & SLA Validation

Sentry Claude Code Plugin and Skills

Jan 3, 2026 By Sentry In Sentry

We're looking at what building a standard set of tools to use in Claude Code looks like for Sentry - and Claude's plugin marketplace is a great way to distribute them. Take it for a spin - the plugin distributes the MCP server, a few skills for setting up core parts of Sentry, and a few commands to use.

View Video

Sentry

Read more about Sentry Claude Code Plugin and Skills

Using Sentry's MCP in Vercel v0

Jan 3, 2026 By Sentry In Sentry

You can use the Sentry MCP server to debug issues in applications that are built in v0. You can also use it to create projects, pull down configurations, or dig into performance issues in applications. Check it out.

View Video

Sentry

Read more about Using Sentry's MCP in Vercel v0

Guided DNS Troubleshooting, Full History, and UI Improvements

Jan 3, 2026 By Matt Rideout In DNS Check

We're kicking off 2026 by shipping improvements that make DNS troubleshooting faster and more intuitive. When a DNS check fails, you'll now see exactly what needs to change to fix it, and you'll have complete visibility into your DNS record behavior over time.

Read Post

DNS Check

Read more about Guided DNS Troubleshooting, Full History, and UI Improvements

New Relic vs Sentry - Which Monitoring Tool to Choose? [2026]

Jan 2, 2026 By Pavithra Parthiban In Atatus

New Relic and Sentry are both popular monitoring tools, but they're built for very different problems. If you put them side by side and expect a fair fight, you'll quickly find they don't really compete on the same ground. Sentry is built for developers who want to know exactly what broke, where, and why. It's precise, code-first, and excellent at error tracking.

Read Post

Atatus

Read more about New Relic vs Sentry - Which Monitoring Tool to Choose? [2026]

How Grafana Mimir Cut Costs 25%: Kafka and WarpStream at Massive Scale | Big Tent S3E3

Jan 2, 2026 By Grafana In Grafana

Big Tent hosts Mat Ryer and Tom Wilkie talk with Marco Pracucci (Grafana Labs), Cyril Tovena (Grafana Labs), and Ryan Worl (WarpStream/Confluent) about building Sigyn (the internal code name for Mimir’s next-gen architecture): public, open source, and designed for lower TCO and stronger reliability. They cover gapless consumption, predictable partitioning, new “block builder” components, and the practical realities of migrating “mid-flight.”

View Video

Grafana

Read more about How Grafana Mimir Cut Costs 25%: Kafka and WarpStream at Massive Scale | Big Tent S3E3

Top Datadog Competitors and Alternatives in 2026

Jan 2, 2026 By Aiswarya S In Atatus

Datadog is widely recognized for its comprehensive range of products and tools, making it quite a challenge to find a suitable alternative. When seeking an alternative to Datadog, it's essential to conduct a thorough comparison of features, performance, limitations, and other vital aspects. This task requires a deep dive into the details, and it might not be as straightforward as it seems at first glance.

Read Post

Atatus

Read more about Top Datadog Competitors and Alternatives in 2026

EP #3: Cloud, Kubernetes, and the Evolution of DevOps - The Open Source Observability Podcast

Jan 2, 2026 By Coroot In Coroot

Kris Buytaert is the Co-founder of Inuits, O11y, and ‘DevOps Days,’ an internationally-attended series of DevOps events. He is a passionate advocate of Free and Open Source Software, and is credited by the community as a founding instigator of the DevOps movement. In this episode we trace the history of the DevOps movement from its intersection with open source and Agile, through the evolution of Cloud technologies and tools such as Docker and Kubernetes, to present-day best practices for CI/CD, monitoring, and observability.

View Video

Coroot

Read more about EP #3: Cloud, Kubernetes, and the Evolution of DevOps - The Open Source Observability Podcast

Observability vs. Monitoring

Jan 2, 2026 By Coroot In Coroot

Kris Buytaert, Co-founder of DevOps Days, O11y, and Inuits and pivotal instigator of the DevOps movement, explains why “observability best practices” start with functioning monitoring, and covers common mistakes to avoid.

View Video

Coroot

Read more about Observability vs. Monitoring

Podman vs Docker 2026: Security, Performance & Which to Choose

Jan 2, 2026 By Anjali Udasi In Last9

When it comes to containerization technologies, Podman and Docker are the two giants that often come up in conversation. Both have revolutionized how we build, deploy, and manage containers, but what sets them apart? In this blog, we'll dive deep into a side-by-side comparison of Podman and Docker. We'll cover everything from architecture to security, performance, and compatibility.

Read Post

Last9

Read more about Podman vs Docker 2026: Security, Performance & Which to Choose

VPN Connection Monitoring: Performance & Availability

Jan 2, 2026 By Dotcom-Monitor In Dotcom-Monitor

For a growing number of organizations, the VPN is no longer a peripheral security control. It is the network. Remote employees authenticate through it. Contractors reach internal tools through it. Administrators access cloud consoles through it. Entire application stacks depend on encrypted tunnels to function at all. When VPN connectivity degrades, productivity collapses quietly and unevenly—often without a clear signal pointing to the root cause.

Read Post

Dotcom-Monitor

Read more about VPN Connection Monitoring: Performance & Availability

How to Choose the Best Website Monitoring Tool for Your Company

Jan 2, 2026 By Dotcom-Monitor In Dotcom-Monitor

Selecting the right website monitoring solution is a critical business decision that directly impacts your operational resilience, customer satisfaction, and bottom line. Downtime, slow load times, or broken user journeys can lead to lost revenue, damaged brand trust, and poor search engine rankings. That’s why website monitoring is no longer optional; it’s a strategic necessity.

Read Post

Dotcom-Monitor

Read more about How to Choose the Best Website Monitoring Tool for Your Company

Transaction Check Best Practices

Jan 2, 2026 By Uptime Website Monitoring In uptime

Welcome back to Uptime.com! In this video, we explore best practices for configuring Transaction Checks to simulate user actions on your website. Learn how to build reliable scripts with a series of commands and validators using our no-code Transaction Check Recorder or your developer tools. We cover essential tips like keeping checks streamlined, using 'wait for' commands to ensure element readiness, and validating URL transitions. Follow along as we set up a simple 7-step check to validate the Uptime.com domain health tool.

View Video

uptime

Monitoring

Read more about Transaction Check Best Practices

Performing Real-Time Anomaly Detection with InfluxDB 3: An In-Depth Guide

Jan 2, 2026 By Suyash Joshi In InfluxData

If you’re working with sensors, machines, or embedded systems, your primary goal is simple: no unplanned downtime and smooth operations. This means detecting errors and taking action as soon as possible, ideally preventing them through predictive maintenance before they become critical issues.

Read Post

InfluxData

Read more about Performing Real-Time Anomaly Detection with InfluxDB 3: An In-Depth Guide

Mimir's next-gen architecture-Kafka in the middle, object storage underneath, and a whole lot less coupling

Jan 2, 2026 By Alexa Becker In Grafana

Sometimes the most important engineering work starts with a deceptively simple question. Not “What’s the best dashboard layout?” or “How many Ts are in Matt?” (still contested), but something much more fundamental: What if the read path and the write path didn’t have to share the same fate?

Read Post

Grafana

Read more about Mimir's next-gen architecture-Kafka in the middle, object storage underneath, and a whole lot less coupling

Datadog Pricing 2026: Full Cost Breakdown + How to Save 40-90%

Jan 2, 2026 By Anjali Udasi In Last9

When it comes to monitoring and observability tools, Datadog is often one of the first names that comes to mind. But while Datadog’s features are widely discussed, its pricing often remains a topic of confusion. How much does Datadog cost, and what factors influence your bill? This guide breaks down Datadog pricing to help you better understand its structure, hidden nuances, and whether it’s the right fit for your needs.

Read Post

Last9

Read more about Datadog Pricing 2026: Full Cost Breakdown + How to Save 40-90%

Online IQ Testing as a Digital Measurement System

Jan 2, 2026 By OpsMatters In OpsMatters

Online cognitive testing has moved far beyond casual quizzes. Today, an online IQ test is a structured digital system that collects inputs, processes data, and produces a measurable output - a score intended to reflect cognitive ability. From an operations perspective, this makes IQ testing surprisingly similar to any modern measurement pipeline: inputs, validation, processing, monitoring, and reporting.

Read Post

OpsMatters

Read more about Online IQ Testing as a Digital Measurement System

Top tips: How small IT organizations can save big on development costs

Jan 1, 2026 By Eric Roshaan In ManageEngine

Top tips is a weekly column where we highlight what’s trending in the tech world and share ways to stay ahead. This week, we’re taking a closer look at how smaller IT teams can keep their development costs under control—without sacrificing quality or long-term viability. When you're a large IT enterprise providing services to millions of users around the world, it's only natural to expect development costs to be sky high.

Read Post

ManageEngine

Read more about Top tips: How small IT organizations can save big on development costs

What's New at Logz.io - January 2026

Jan 1, 2026 By Amos Etzion In logz.io

We’ve updated Logs Explore to integrate the real-time streaming capabilities of the old “LiveTail” into our new Explore environment. The result? A faster and more seamless experience. Customers can now benefit from.

Read Post

logz.io

Read more about What's New at Logz.io - January 2026

Website Performance Monitoring, Change Detection, and Alerts: What You Should Know

Jan 1, 2026 By Dotcom-Monitor In Dotcom-Monitor

A business website isn’t just an online presence; it’s the virtual front door to your business, brand, or service. If the door remains locked, opens slowly, or undergoes unexpected changes, you run the risk of losing visitors, customers, and revenue. That’s where comprehensive website monitoring becomes essential. Modern web monitoring goes far beyond simple uptime checks.

Read Post

Dotcom-Monitor

Read more about Website Performance Monitoring, Change Detection, and Alerts: What You Should Know

Operations | Monitoring | ITSM | DevOps | Cloud