The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Cribl Search Pack for Missing Logs

Ever run a SIEM search only to see nothing for your firewall logs? In this video, we show a smarter way to detect when log sources stop sending data using Cribl Lake, Cribl Search, and Cribl Stream. Learn how to track “last seen” times, build efficient aggregations, and get real-time alerts—without burning SIEM resources or storage.
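
The "last seen" idea can be illustrated outside any particular product. The sketch below is plain Python, not Cribl's query language or API: it keeps the newest timestamp per log source and flags any source that has been quiet longer than a threshold. The function name, the `(source, timestamp)` event shape, and the one-hour threshold are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def find_silent_sources(events, now, max_silence):
    """Return {source: last_seen} for sources quiet longer than max_silence.

    `events` is an iterable of (source, timestamp) pairs; this shape is a
    hypothetical stand-in for whatever the real log pipeline emits.
    """
    last_seen = {}
    for source, ts in events:
        # Aggregate down to one value per source: the newest timestamp.
        if source not in last_seen or ts > last_seen[source]:
            last_seen[source] = ts
    # A source is "silent" when its newest event is older than the threshold.
    return {s: ts for s, ts in last_seen.items() if now - ts > max_silence}


now = datetime(2025, 1, 1, 12, 0)
events = [
    ("firewall", datetime(2025, 1, 1, 9, 0)),
    ("webserver", datetime(2025, 1, 1, 11, 55)),
    ("firewall", datetime(2025, 1, 1, 9, 30)),
]
silent = find_silent_sources(events, now, timedelta(hours=1))
print(silent)
```

Storing only one timestamp per source keeps the check cheap, since an alert needs the small "last seen" table rather than a scan of the raw logs.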

What's New in VictoriaMetrics Cloud Q3 2025 - Cloud Database

Join Marc Sherwood and Jose Gomez-Selles as they unveil the significant updates to VictoriaMetrics Cloud from Q3 2025 and share a glimpse into the exciting roadmap for what's coming next! This session is packed with new features designed to make your monitoring experience more robust, user-friendly, and cost-effective. In this video, you'll discover: Expansion to Asia! VictoriaMetrics Cloud now has a brand new region on AWS ap-southeast-1 (Singapore) in Asia Pacific, bringing lower latency and regional data sovereignty closer to your teams and deployments.

How to Monitor Network Performance for Multi-Site Businesses

When you’re a business managing network performance across 15 branch offices in different cities, you’re going to see some blind spots. Your headquarters may experience consistent connectivity, while remote locations experience unpredictable slowdowns that can affect your daily operations.

Easy Guide for Connecting Redis to a Grafana Data Source

Redis is a widely used in-memory data store, commonly deployed as a cache, session store, message broker, or fast key-value database. Because Redis often sits on the critical path of an application, having visibility into its behavior (memory usage, client connections, command throughput, cache efficiency) is essential for troubleshooting and performance tuning.
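
As a rough illustration of "cache efficiency", the hit ratio can be derived from two counters that the Redis `INFO stats` section reports, `keyspace_hits` and `keyspace_misses`. The sketch below assumes those counters have already been fetched into a dictionary; the helper name and the sample values are hypothetical, and it does not show how the linked guide configures Grafana.

```python
def cache_hit_ratio(info):
    """Compute hits / (hits + misses) from Redis INFO stats counters.

    `info` is assumed to be a dict holding the `keyspace_hits` and
    `keyspace_misses` fields. Returns None when no lookups were recorded.
    """
    hits = int(info.get("keyspace_hits", 0))
    misses = int(info.get("keyspace_misses", 0))
    total = hits + misses
    return hits / total if total else None


# Sample values are made up for illustration only.
sample_info = {"keyspace_hits": 900, "keyspace_misses": 100}
ratio = cache_hit_ratio(sample_info)
print(ratio)
```

Both counters are cumulative since the server started (or since the stats were last reset), so a dashboard would usually compute the ratio over the change between two samples rather than over the lifetime totals.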

Automate flaky test fixes with the Bits AI Dev Agent and Test Optimization

Flaky tests are a significant source of inefficiency that impacts many engineering teams. Along with failing your build, they interrupt your entire development flow, generate excessive CI/CD noise, and, critically, compromise developer trust in the test suite itself. Datadog Test Optimization enables you to manage test suites at scale by pinpointing the flakiest tests, analyzing their history across hundreds of runs, and automatically surfacing the root cause.

How we built an AI SRE agent that investigates like a team of engineers

We built Bits AI SRE to help engineers investigate and solve production incidents, one of the most difficult aspects of operating distributed systems today. As environments grow more dynamic and complex, resolving issues becomes more challenging. Failures now span more services, involve noisier signals, and encompass larger volumes of telemetry data, making it hard for on-call engineers to find root causes quickly. Today, Bits AI SRE is already helping teams decrease time to resolution by up to 95%.

Heroku Monitoring Add-ons 2026 and Hosted Graphite

Monitoring the performance of Heroku applications helps improve user experience. This blog post covers Heroku monitoring add-ons and explores why Hosted Graphite is the best choice in 2026. We'll discuss the benefits and setup process of the Hosted Graphite add-on, as well as future trends in Heroku monitoring.

Intercom outage - January 9th, 2026

Ever had that sinking feeling when your help desk just stops responding, but the official status page says everything is “up and running”? That’s exactly what happened on January 9, 2026, when Intercom – one of the world’s most popular support tools – hit a major snag. While hundreds of companies were left staring at loading circles, StatusGator was already on the case.