
Anomaly Detection


Extending Netdata's anomaly detection training window

We have been hard at work under the hood of the Netdata agent, introducing new capabilities that let you extend the "training window" used by Netdata's native anomaly detection. This blog post discusses one of these improvements, which helps you reduce "false positives" by effectively extending the training window via the new (beautifully named) number of models per dimension configuration parameter.
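As a rough sketch of what this looks like in practice: the parameter name comes from the post above, but the section name and the value shown here are assumptions for illustration — check the Netdata documentation for the actual location and defaults in your agent version.

```
[ml]
  # Keep more than one trained model per dimension. With several models,
  # each trained at a different time, the agent has a longer effective
  # lookback, which the post says helps reduce false positives.
  # The value below is illustrative, not a recommended default.
  number of models per dimension = 2
```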


What are AIOps use cases?

The past decade has seen organizations embrace AI and data analytics at scale. In 2022, IBM found that 35% of organizations have embraced AI—a 4% increase from 2021. The trend of AI adoption will continue to play out in the next several years across virtually every organizational function. At the vanguard of this movement is AIOps, which sees AI used to improve IT operations (ITOps).

Sponsored Post

Using AIOps for Better Adaptive Incident Management

An effective incident management strategy is crucial for any business, especially those offering consumer-facing digital services. This is because when incidents occur, they may be easily detected by your users, impact your reputation, and ultimately affect your bottom line. So, to minimize the reach and severity of incidents, your response needs to be swift and effective. One way to ensure your approach meets these requirements is to implement AIOps.


Introducing Outlier Detection in Grafana Machine Learning for Grafana Cloud

Outlier Detection is now available as part of the Grafana Machine Learning toolkit in Grafana Cloud for Pro and Advanced users. With this feature, you can monitor a group of similar things, such as load-balanced pods in Kubernetes, and get alerted when some of them start behaving differently from their peers.
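Grafana's implementation details aside, the core idea of peer-group outlier detection can be sketched in a few lines — this is a simplified illustration, not Grafana's actual algorithm: compare each series against the per-timestamp median of the group and flag the ones that consistently stray far from it.

```python
import statistics

def peer_outliers(series_by_name, k=3.0):
    """Flag series that stray from their peer group.

    series_by_name: dict mapping series name -> list of samples,
    with all series aligned on the same timestamps.
    Returns the set of names flagged as outliers.
    """
    names = list(series_by_name)
    n_points = len(next(iter(series_by_name.values())))
    # accumulate each series' absolute deviation from the
    # per-timestamp median of the whole group
    deviation = {name: 0.0 for name in names}
    for t in range(n_points):
        snapshot = [series_by_name[name][t] for name in names]
        med = statistics.median(snapshot)
        for name in names:
            deviation[name] += abs(series_by_name[name][t] - med)
    # a series is an outlier if its deviation is far above the typical one
    typical = statistics.median(deviation.values())
    return {name for name in names
            if typical > 0 and deviation[name] > k * typical}

pods = {
    "pod-a": [10, 11, 10, 12],
    "pod-b": [11, 10, 11, 11],
    "pod-c": [10, 12, 11, 10],
    "pod-d": [55, 60, 58, 57],  # misbehaving peer
}
print(sorted(peer_outliers(pods)))  # → ['pod-d']
```

Real systems refine this with robust scale estimates and clustering, but the comparison-against-peers step is the heart of the technique.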


How to Detect Anomalies and Why You Should Care

Companies today are relying on technology more than ever thanks to widespread digital transformation and cloud initiatives, which increases the need for safe, efficient, and reliable IT environments. But maintaining operational IT stability is difficult given how complex and dynamic today's IT environments are: they change constantly as new network devices, users, and software versions are introduced.


Automate Anomaly Detection for Time Series Data

This article was originally published in The New Stack and is reposted here with permission. Hundreds of billions of sensors produce vast amounts of time series data every day. The sheer volume of data that companies collect makes it challenging to analyze and glean insights. Machine learning drastically accelerates time series data analysis so that companies can understand and act on their time series data to drive significant innovation and improvements.


Anomaly Detection and AIOps - Your On-Call Assistant for Intelligent Alerting and Root Cause Analysis

In this blog, we examine how anomaly detection helps by setting up healthy alerts and providing efficient root cause analysis. Anomaly detection, part of AIOps, guides your attention to the places and times where remarkable things occurred. It reduces information overload, thereby speeding up RCA investigation.


Expedite infrastructure investigations with Kubernetes Anomalies

Modern Kubernetes environments are becoming increasingly complex. In 2021, Datadog analyzed real-world usage data from more than 1.5 billion containers and found that the average number of pods per organization had doubled over the course of two years. Organizations running containers also tend to deploy more monitors than companies that don’t leverage containers, pointing to the increased need for monitoring in these environments.


Common Anomaly Detection Challenges & How To Solve Them

Anomalies can be defined as data points or events that deviate from their normal behavior. In the context of continuous time-series datasets, the normal or expected value is the baseline, and the limits around it represent the tolerance for variance. If a new value deviates above or below these limits, that data point can be considered anomalous.
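The baseline-plus-limits idea described above can be sketched directly — a minimal illustration, assuming a rolling mean as the baseline and a band of `k` rolling standard deviations as the tolerance (window size and `k` are arbitrary choices here, not recommendations):

```python
import statistics

def detect_anomalies(values, window=10, k=3.0):
    """Flag points that fall outside a rolling baseline +/- tolerance band.

    The baseline is the mean of the previous `window` samples and the
    tolerance is `k` standard deviations of that same history.
    Returns the indices of anomalous points.
    """
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        baseline = statistics.fmean(history)
        tolerance = k * statistics.stdev(history)
        if abs(values[i] - baseline) > tolerance:
            anomalies.append(i)
    return anomalies

steady = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.3, 9.7, 10.1, 10.0]
series = steady + [10.2, 25.0, 10.1]  # spike at index 11
print(detect_anomalies(series))  # → [11]
```

Note that a static band like this is only the starting point: seasonal data needs a baseline that follows the cycle, which is where the machine-learning approaches covered in these posts come in.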