
Anomaly Detection


Extending Netdata's anomaly detection training window

We have been hard at work under the hood of the Netdata agent, introducing new capabilities that let you extend the "training window" used by Netdata's native anomaly detection. This blog post discusses one of these improvements, which helps you reduce "false positives" by effectively extending the training window via the new (beautifully named) number of models per dimension configuration parameter.
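As a rough sketch of what this looks like in practice: the parameter name comes from the post above, but the section name and the value shown here are assumptions for illustration — check the Netdata documentation for the actual location and defaults in your agent version.

```
[ml]
  # Keep more than one trained model per dimension. With several models,
  # each trained at a different time, the agent has a longer effective
  # lookback, which the post says helps reduce false positives.
  # The value below is illustrative, not a recommended default.
  number of models per dimension = 2
```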


What are AIOps use cases?

The past decade has seen organizations embrace AI and data analytics at scale. In 2022, IBM found that 35% of organizations have embraced AI—a 4% increase from 2021. The trend of AI adoption will continue to play out in the next several years across virtually every organizational function. At the vanguard of this movement is AIOps, which sees AI used to improve IT operations (ITOps).

Sponsored Post

Using AIOps for Better Adaptive Incident Management

An effective incident management strategy is crucial for any business, especially those offering consumer-facing digital services. This is because when incidents occur, they may be easily detected by your users, impact your reputation, and ultimately affect your bottom line. So, to minimize the reach and severity of incidents, your response needs to be swift and effective. One way to ensure your approach meets these requirements is to implement AIOps.


Introducing Outlier Detection in Grafana Machine Learning for Grafana Cloud

Outlier Detection is now available as part of the Grafana Machine Learning toolkit in Grafana Cloud for Pro and Advanced users. With this feature, you can monitor a group of similar things, such as load-balanced pods in Kubernetes, and get alerted when some of them start behaving differently from their peers.
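Grafana's implementation details aside, the core idea of peer-group outlier detection can be sketched in a few lines — this is a simplified illustration, not Grafana's actual algorithm: compare each series against the per-timestamp median of the group and flag the ones that consistently stray far from it.

```python
import statistics

def peer_outliers(series_by_name, k=3.0):
    """Flag series that stray from their peer group.

    series_by_name: dict mapping series name -> list of samples,
    with all series aligned on the same timestamps.
    Returns the set of names flagged as outliers.
    """
    names = list(series_by_name)
    n_points = len(next(iter(series_by_name.values())))
    # accumulate each series' absolute deviation from the
    # per-timestamp median of the whole group
    deviation = {name: 0.0 for name in names}
    for t in range(n_points):
        snapshot = [series_by_name[name][t] for name in names]
        med = statistics.median(snapshot)
        for name in names:
            deviation[name] += abs(series_by_name[name][t] - med)
    # a series is an outlier if its deviation is far above the typical one
    typical = statistics.median(deviation.values())
    return {name for name in names
            if typical > 0 and deviation[name] > k * typical}

pods = {
    "pod-a": [10, 11, 10, 12],
    "pod-b": [11, 10, 11, 11],
    "pod-c": [10, 12, 11, 10],
    "pod-d": [55, 60, 58, 57],  # misbehaving peer
}
print(sorted(peer_outliers(pods)))  # → ['pod-d']
```

Real systems refine this with robust scale estimates and clustering, but the comparison-against-peers step is the heart of the technique.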


How to Detect Anomalies and Why You Should Care

Companies today are relying on technology more than ever thanks to widespread digital transformation and cloud initiatives, which increases the need for safe, efficient, and reliable IT environments. But maintaining operational IT stability is difficult given how complex and dynamic today's IT environments are: they change constantly as new network devices, users, and software versions are introduced.


Automate Anomaly Detection for Time Series Data

This article was originally published in The New Stack and is reposted here with permission. Hundreds of billions of sensors produce vast amounts of time series data every day. The sheer volume of data that companies collect makes it challenging to analyze and glean insights. Machine learning drastically accelerates time series data analysis so that companies can understand and act on their time series data to drive significant innovation and improvements.


Anomaly Detection and AIOps - Your On-Call Assistant for Intelligent Alerting and Root Cause Analysis

In this blog, we examine how anomaly detection helps by setting up healthy alerts and providing efficient root cause analysis. Anomaly detection, part of AIOps, guides your attention to the places and times where remarkable things occurred. It reduces information overload, thereby speeding up RCA investigation.


Expedite infrastructure investigations with Kubernetes Anomalies

Modern Kubernetes environments are becoming increasingly complex. In 2021, Datadog analyzed real-world usage data from more than 1.5 billion containers and found that the average number of pods per organization had doubled over the course of two years. Organizations running containers also tend to deploy more monitors than companies that don’t leverage containers, pointing to the increased need for monitoring in these environments.


Common Anomaly Detection Challenges & How To Solve Them

Anomalies can be defined as data points or events that deviate from their normal behavior. In the context of continuous time-series datasets, the normal or expected value is the baseline, and the limits around it represent the tolerance for variance. If a new value deviates above or below these limits, that data point can be considered anomalous.
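The baseline-plus-limits idea described above can be sketched directly — a minimal illustration, assuming a rolling mean as the baseline and a band of `k` rolling standard deviations as the tolerance (window size and `k` are arbitrary choices here, not recommendations):

```python
import statistics

def detect_anomalies(values, window=10, k=3.0):
    """Flag points that fall outside a rolling baseline +/- tolerance band.

    The baseline is the mean of the previous `window` samples and the
    tolerance is `k` standard deviations of that same history.
    Returns the indices of anomalous points.
    """
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        baseline = statistics.fmean(history)
        tolerance = k * statistics.stdev(history)
        if abs(values[i] - baseline) > tolerance:
            anomalies.append(i)
    return anomalies

steady = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.3, 9.7, 10.1, 10.0]
series = steady + [10.2, 25.0, 10.1]  # spike at index 11
print(detect_anomalies(series))  # → [11]
```

Note that a static band like this is only the starting point: seasonal data needs a baseline that follows the cycle, which is where the machine-learning approaches covered in these posts come in.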