
MCP Server Integration & Much More: What's New in VictoriaMetrics Cloud Q2 2025

Q2 2025 has brought another wave of improvements to VictoriaMetrics Cloud! If you tuned in to our latest Quarterly Virtual Meetup, you saw firsthand how we’re making observability even more accessible, powerful, and interactive.

Top Kubernetes Monitoring Tools in 2025, And Why Alerting Is Critical for DevOps and SRE Teams

What are the best Kubernetes monitoring tools in 2025? And how can you ensure alerts actually drive action when something goes wrong? Kubernetes monitoring is critical for keeping your containerized applications healthy, but alerting is often overlooked. This blog compares popular tools like Prometheus and Datadog and explains why intelligent alerting solutions like OnPage are essential for effective incident response.

What is a Jitter Buffer and How It Works

If you've ever been on a choppy VoIP call or sat through a video meeting where people sounded like robots from the ’90s, you’ve likely run into a little thing called jitter. It’s one of those sneaky network issues that doesn’t always get the attention it deserves, until it ruins your real-time traffic. As IT pros and network admins, you're probably used to dealing with packet loss and latency. But jitter? That one's a bit trickier.

A Detailed Look at Calico Cloud Free Tier

As Kubernetes environments grow in scale and complexity, platform teams face increasing pressure to secure workloads without slowing down application delivery. But managing and enforcing network policies in Kubernetes is notoriously difficult—especially when visibility into pod-to-pod communication is limited or nonexistent. Teams are often forced to rely on manual traffic inspection, standalone logs, or trial-and-error policy changes, increasing the risk of misconfiguration and service disruption.

Robust Time Series Monitoring: Anomaly Detection Using Matrix Profile and Prophet

Monitoring production systems often feels like searching for a moving needle in a constantly shifting haystack. At Sentry, our goal was to empower customers to move beyond traditional threshold and percentage-based alerting. We aimed to help them detect subtle and complex anomalies in their systems in near real-time. This post will detail how our AI/ML team developed a time series anomaly detection system using Matrix Profile and Meta’s Prophet.

Dynamic Status Pages on Demand

Clients expect transparency - especially when things go wrong. But manually updating a status page during an incident or maintenance window slows you down when speed matters most. Oh Dear’s status pages are more than just a pretty uptime dashboard. They’re fully API-driven and designed to scale with your workflow. Whether you manage five client sites or five hundred, you can create, update and sync status pages as needed. Here’s how to do it.

How we built agentic incident response

AI already transforms how we detect, respond to, and resolve outages. Traditional workflows often force responders to switch between dashboards, sift through logs, and coordinate across fragmented channels under stress. This reactive, manual approach leads to slower resolution, higher operational costs, and burnout, especially as IT systems grow more complex. At ilert, we are not just discussing the future of incident management – we are actively building it.

10 Essential Things to Know Before Diving into Database DevOps

In today’s rapidly evolving world of development, Database DevOps is becoming an essential practice. It combines the agility of DevOps with the intricacies of databases, all with the goal of enhancing speed, stability and collaboration when it comes to database changes. However, before diving into Database DevOps, there are some key concepts your team should get acquainted with. Here are 10 key things you should know to truly understand Database DevOps and reap its benefits.

From Detection to Action: Elevating Microsoft Sentinel with SIGNL4 Mobile Alerting

It’s 2:13 a.m. Your Microsoft Sentinel instance has flagged a high-severity alert – potential lateral movement detected across several endpoints. But the on-call analyst is fast asleep. The alert was sent… via email. By the time someone notices, hours have passed. The threat? It’s already spread. In modern security operations, detection is only half the battle. The other half? Making sure the right human sees the alert – and acts on it – in time.