Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Access your data with Federated Analytics for Amazon Security Lake. Insights from Splunk, AWS, and A

Federated Analytics gives organizations the full power of Splunk extended to data stored in Amazon Security Lake. Trusted partners like Accenture are helping bring these new capabilities to life at organizations around the world.

What Is Network Device Monitoring? Find Out 5 Top Monitoring Tools

Businesses, organizations, and individuals rely on networks to communicate and exchange data. The rapid growth of technology and increasing reliance on networked systems have made robust network performance and security critical. However, maintaining optimal network performance and security is a difficult task. Network failures, security breaches, and performance bottlenecks can result in substantial financial losses and reputational damage.

Top 5 EdTech outages detected by StatusGator in January 2025

Educational platforms are essential for students, educators, and institutions, making service disruptions especially impactful. StatusGator’s early detection ensures that users receive timely alerts before official acknowledgments, helping them navigate unexpected downtime. Below, we recap significant education-related outages from January 2025, where StatusGator kept users ahead of disruptions.

How AI-powered anomaly detection is transforming APM for SREs

Site reliability engineers (SREs) often face challenges in keeping an organization’s sites running smoothly as the complexity of distributed systems steadily increases. With the rise of microservices, cloud-native architectures, and massive data volumes, manual monitoring and troubleshooting are no longer sustainable. SREs must navigate hurdles like alert fatigue, incident response delays, and the constant pressure to maintain system reliability.

Getting Started with M365 dashboards

SquaredUp is a flexible dashboard and analytics platform that makes it easy to build dashboards for your M365 and Intune usage and analytics, which you can then use for monitoring or sharing. In this article we’ll take a look at getting started with the M365 plugin for SquaredUp and building our first dashboard. Sign up for a free account if you’d like to follow along.

Petabyte Scale, Gigabyte Costs: Mezmo's Evolution from Elasticsearch to Quickwit

At Mezmo, we handle an enormous volume of telemetry data for our customers and ourselves, requiring a robust and efficient search and analytics backend. For years, Elasticsearch served us well, but as our infrastructure grew to a multi-cluster, multi-petabyte scale, we started to see the cracks: rising costs, performance bottlenecks, and scalability concerns. We needed a change, one that would make our system more cost-effective while maintaining speed and reliability.

Kubernetes Monitoring and Alerting Made Easy with Splunk Observability Cloud and OpenTelemetry

In this video, I'll show you how to quickly set up monitoring and alerting for your Kubernetes clusters using Splunk Observability Cloud. We’ll start by deploying the Splunk OpenTelemetry Collector using Helm, and then use the Kubernetes Navigator inside Splunk Observability Cloud to view the health of our cluster and the applications it’s hosting. I’ll demonstrate AutoDetect detectors and alerts by intentionally triggering an issue in the cluster and walk through the alerting process. We’ll review the alerts in Splunk Observability Cloud and then resolve the issue in the cluster.