
The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

How Alerting Works in SolarWinds Observability Self-Hosted

This training video from SolarWinds Academy provides a high-level overview of how the alerting process works within SolarWinds software. Technical trainer Cheryl Nomanson explains the step-by-step workflow, starting with the alerting engine continuously scanning the database for conditions that meet alert trigger thresholds. She covers how triggered elements are evaluated for suppressions (like time-of-day restrictions and scoping), and explains that only fully qualified conditions become actual alerts. The video details how alerts always display in the web console and may trigger additional actions like emails or scripts.
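The workflow described above (scan for trigger conditions, apply suppressions, raise only fully qualified alerts) can be sketched in a few lines of Python. This is an illustrative sketch only, not SolarWinds code: the `Reading` type, the `in_scope` flag, the CPU threshold, and the node names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class Reading:
    node: str            # hypothetical node name
    cpu_percent: float   # hypothetical polled metric
    in_scope: bool       # hypothetical flag: is the node inside the alert's scope?


def within_window(now: time, start: time, end: time) -> bool:
    """Time-of-day restriction: True when `now` falls inside [start, end)."""
    return start <= now < end


def evaluate(readings, threshold, now, start, end):
    """Return the nodes whose readings become actual alerts."""
    alerts = []
    for r in readings:
        # Step 1: does the element meet the trigger condition?
        triggered = r.cpu_percent > threshold
        # Step 2: is the triggered element suppressed by scope or time of day?
        suppressed = (not r.in_scope) or (not within_window(now, start, end))
        # Step 3: only fully qualified conditions become alerts.
        if triggered and not suppressed:
            alerts.append(r.node)
    return alerts


readings = [
    Reading("core-sw-01", 97.0, True),
    Reading("lab-sw-02", 99.0, False),   # triggered, but out of scope
    Reading("edge-rtr-03", 40.0, True),  # below threshold, never triggers
]
alerts = evaluate(readings, 90.0, time(10, 30), time(8, 0), time(18, 0))
print(alerts)
```

In this sketch, follow-up actions such as emails or scripts would hang off the returned list; they are left out to keep the evaluation order visible.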

How to Create an SNMP Poller in SolarWinds Observability Self-Hosted

SolarWinds technical trainer Cheryl Nomanson presents a systematic approach to optimizing and building custom SNMP pollers. The tutorial walks through a step-by-step process starting with adding devices for SNMP monitoring using default pollers, then identifying missing metrics and checking if the required OIDs exist. If OIDs don't exist, she explains how to use alternative OIDs or data transformation tools.

Networking Technology Trends for 2026

From an IT pro’s perspective, the future of networking technology in 2026 is a mixed bag of potential and security risk. New wireless tech, agentic AI, and the increased distribution of networks are enabling new use cases and helping automate toil, but they also create new attack surfaces and risk profiles. In this article, we’ll take a look at the ten network security trends we’re most excited about in 2026 and provide key insights about what each one means for IT and MSP teams.

Integrating Prometheus Metrics into Icinga Using check_prometheus

This article explains how to integrate metrics from Prometheus into Icinga checks using the check_prometheus plugin. There can be multiple reasons why this could be desired: Maybe you have different teams with their own monitoring systems, and you need to bridge the gap, or you want to perform queries that are just better expressed in Prometheus than in plain Icinga check plugins. The latter can be the case if you want to aggregate data from multiple sources or you want to take historic data into account.
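To make the idea concrete, here is a minimal sketch of how a check could map a Prometheus instant-query result to a monitoring status. It is not the check_prometheus plugin and does not reflect its options. It assumes the JSON shape returned by the Prometheus HTTP API's `/api/v1/query` endpoint and the conventional plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The function name, thresholds, and sample payload are hypothetical.

```python
import json

# Conventional monitoring-plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_from_response(body: str, warn: float, crit: float):
    """Map the first sample of an instant-query response to (exit_code, message)."""
    try:
        payload = json.loads(body)
        # Instant vectors carry samples as [timestamp, "value-as-string"].
        value = float(payload["data"]["result"][0]["value"][1])
    except (ValueError, KeyError, IndexError, TypeError):
        return UNKNOWN, "UNKNOWN - could not read a sample from the response"
    if value >= crit:
        return CRITICAL, f"CRITICAL - value is {value}"
    if value >= warn:
        return WARNING, f"WARNING - value is {value}"
    return OK, f"OK - value is {value}"


# Hypothetical response body; in practice it would come from an HTTP request.
sample = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [{"metric": {"job": "api"}, "value": [1700000000, "0.93"]}],
    },
})

code, message = check_from_response(sample, warn=0.8, crit=0.9)
print(code, message)
```

Keeping the parsing separate from the HTTP call makes the threshold logic easy to exercise without a running Prometheus server.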

The Incident Checklist: Reducing Cognitive Load When It Matters Most

In the previous post, we looked at what happens after detection, when incidents stop being purely technical problems and become human ones, with cognitive load as the real constraint. This post assumes that context. The question here is simpler and more practical: what actually helps teams think clearly and act well once things are already going wrong? One answer, used quietly but consistently by high-performing teams, is the checklist.

Domain Health Check: Why It Matters and What It Reveals

Your domain is more than a URL: it’s the control plane for how people (and machines) reach your website, apps, and inbox. When something breaks at the domain layer, the symptoms look “random” (site intermittently down, emails bouncing, logins failing), but the root cause is often predictable: misconfigurations, weak authentication, or degraded DNS performance. A domain health check is the fastest way to surface those issues before customers do.

Event context, tags, logs and metrics | Debugging Next.js Applications with Sentry

Adding context to issues captured in Sentry can help you identify and prioritize your most critical issues. Because everything is trace-connected, logs and metrics build context around an error and help you understand correlation and causation in one place.

When DIY Becomes a Network Liability

There is a satisfaction in building things yourself. It is the same psychological hook that powers the endless stream of DIY renovation videos on your social media feeds. You watch a sixty-second clip of someone transforming a pile of lumber into a custom coffee table, and it looks ingenious, cost-effective, and uniquely tailored to their needs. It triggers a powerful "why buy when I can build?" mindset.

Stop Sifting Logs: Find Production Errors in Seconds with `severity=error`

Want your log queries to be more precise? Is your vibe code flooding you with logs, and do you need a helping hand to make sense of it all? Good news! We've upgraded our log query language to be more powerful, flexible, and intuitive, letting you focus on finding answers fast rather than endlessly scrolling through your logs. And that's not all: we've revamped our logging interface, making it easier than ever to manage logs, customize views, and leverage log attributes.

Taming Atlassian Audit Logs: Processing messy JSON to enable operational insights

Atlassian’s audit records are data-rich, but messy. In this data-driven deep dive, Eddy Gurney from NetScout shares what it took to get them into Graylog. He walks through four pipeline approaches and why each fell short, then shows how moving parsing to the edge with Filebeat unlocked Graylog. With clean, flattened events flowing in, alerts and dashboards turn “noise” into operational visibility. You’ll also see how Sidecars make config rollout easy, plus what changes to make if you’re on Atlassian Cloud instead of Data Center.
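As a rough illustration of what "flattened" means here, the sketch below collapses a nested JSON record into single-level keys. It is not the Filebeat or Graylog configuration from the talk; the `flatten` helper, the underscore separator, and the sample event fields are all hypothetical and do not reflect Atlassian's actual audit schema.

```python
def flatten(obj, prefix="", sep="_"):
    """Collapse nested dicts and lists into one level of joined keys."""
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            name = f"{prefix}{sep}{key}" if prefix else str(key)
            flat.update(flatten(value, name, sep))
    elif isinstance(obj, list):
        # List items are keyed by their position.
        for index, value in enumerate(obj):
            name = f"{prefix}{sep}{index}" if prefix else str(index)
            flat.update(flatten(value, name, sep))
    else:
        # Scalar leaf: store it under the accumulated key.
        flat[prefix] = obj
    return flat


# Hypothetical nested audit event, for illustration only.
event = {
    "author": {"name": "jdoe"},
    "changedValues": [{"key": "role", "to": "admin"}],
}

flat_event = flatten(event)
print(flat_event)
```

Note that empty dicts and lists produce no keys in this sketch; a real pipeline would need to decide whether to keep them as explicit empty fields.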