Monitoring

The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Digging Into the Recent Azure Outage

In the early hours of Wednesday, January 25, Microsoft’s public cloud suffered a major outage that disrupted its cloud-based services and popular applications such as SharePoint, Teams, and Office 365. Microsoft has since blamed the outage on a flawed router command that took down a significant portion of the cloud’s connectivity beginning at 07:09 UTC.

Grafana documentation: A look at the new and improved design

We recently launched a new design for our technical documentation. The goal of the redesign was to make our technical documentation more accessible, modern, and scalable as we grow. In addition to a new look (hello, new typeface and layout!), our updated docs pages reveal the underlying work our team has done to evolve and enhance our technical documentation.

Profiling: Buzzword or Critical Observability Tool? | Snack of the Week

Profiling may seem like the latest buzzword in the monitoring and observability world, but profiling tools have actually been in use for decades. I’m going to quickly explain what profiling is and why modern profilers are getting so much attention lately.
Sponsored Post

How to Choose the Right IT Ops Metrics

Traditionally, we think of IT as managing and monitoring on-premises network infrastructure, including hardware and software. However, the reality is that most enterprises have accepted the cloud and already migrated much of their infrastructure to it. They recognize the benefits of the cloud and that it is here for the long haul. According to the latest study from Deloitte, 90% of organizations have been using cloud services for the last three years, and 79% are hosting workloads with multiple cloud providers. In addition, adoption of cloud computing platforms has accelerated significantly in the remote work era.

Cloud Providers Health Report - January 2023

Check out our January 2023 health report on the most popular cloud providers. We analyze the health of each provider based on the number of outages and problems during the month. The data comes from the cloud providers themselves via their status pages; we normalize it and use it to generate the report.
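
To make the "normalize, then count" step concrete, here is a minimal Python sketch. It is not the report's actual pipeline: the entry fields (`provider`, `status`, `started`), the status labels in `SEVERITY_MAP`, and the provider names are all hypothetical, chosen only to show how differently worded status-page entries could be mapped onto a common shape and tallied per month.

```python
from collections import Counter
from datetime import datetime

# Hypothetical mapping from provider-specific status labels to two shared buckets.
SEVERITY_MAP = {
    "service outage": "outage",
    "major outage": "outage",
    "degraded performance": "problem",
    "investigating": "problem",
}


def normalize(entry):
    """Map one raw status-page entry onto (provider, 'YYYY-MM', severity)."""
    started = datetime.strptime(entry["started"], "%Y-%m-%dT%H:%M:%SZ")
    # Unknown labels fall back to the milder "problem" bucket.
    severity = SEVERITY_MAP.get(entry["status"].strip().lower(), "problem")
    return entry["provider"], started.strftime("%Y-%m"), severity


def monthly_report(entries, month):
    """Count outages and problems per provider for one month ('YYYY-MM')."""
    counts = {}
    for entry in entries:
        provider, entry_month, severity = normalize(entry)
        if entry_month != month:
            continue
        counts.setdefault(provider, Counter())[severity] += 1
    return counts


# Made-up sample entries, not real incident data.
SAMPLE = [
    {"provider": "provider-a", "status": "Service Outage", "started": "2023-01-10T03:00:00Z"},
    {"provider": "provider-a", "status": "Degraded Performance", "started": "2023-01-12T09:30:00Z"},
    {"provider": "provider-b", "status": "Major Outage", "started": "2023-01-20T15:45:00Z"},
    {"provider": "provider-b", "status": "Investigating", "started": "2022-12-31T23:00:00Z"},
]

report = monthly_report(SAMPLE, "2023-01")
for provider, tally in sorted(report.items()):
    print(provider, dict(tally))
```

Keeping the label mapping in one table means a new provider's wording can be supported by adding rows rather than changing the counting logic.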

Extending Netdata's anomaly detection training window

We have been busy under the hood of the Netdata agent, introducing new capabilities that let you extend the "training window" used by Netdata's native anomaly detection. This blog post discusses one of these improvements, which helps reduce "false positives" by effectively extending the training window through the new (beautifully named) number of models per dimension configuration parameter.
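
As a rough illustration of why keeping several models per dimension can cut false positives, consider the Python sketch below. It is not Netdata's implementation: each model here is a simple mean and standard deviation check standing in for whatever the agent actually trains, and the rule that a value counts as anomalous only when every retained model flags it is an assumption made for the example. The names `WindowModel`, `MultiModelDetector`, and `models_per_dimension` are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


class WindowModel:
    """Stand-in model: mean and standard deviation of one training window."""

    def __init__(self, samples):
        self.mu = mean(samples)
        self.sigma = pstdev(samples) or 1e-9  # avoid division by zero

    def is_anomalous(self, value, threshold=3.0):
        return abs(value - self.mu) / self.sigma > threshold


class MultiModelDetector:
    """Keeps the most recent N models for a single dimension (metric)."""

    def __init__(self, models_per_dimension):
        # Older models drop off once the limit is reached.
        self.models = deque(maxlen=models_per_dimension)

    def train(self, samples):
        self.models.append(WindowModel(samples))

    def is_anomalous(self, value):
        # Assumption: flag a value only if ALL retained models consider it unusual.
        return bool(self.models) and all(m.is_anomalous(value) for m in self.models)


quiet = [10, 11, 9, 10, 10, 11, 9, 10]   # e.g. an overnight training window
busy = [50, 52, 48, 51, 49, 50, 52, 48]  # e.g. a daytime training window

single = MultiModelDetector(models_per_dimension=1)
multi = MultiModelDetector(models_per_dimension=2)
for window in (quiet, busy):
    single.train(window)
    multi.train(window)

print(single.is_anomalous(10))  # only the "busy" model is kept
print(multi.is_anomalous(10))   # the "quiet" model still remembers this level
print(multi.is_anomalous(200))  # unusual for every retained model
```

With one model, a return to the earlier quiet level looks anomalous because that pattern has been forgotten. With two models the detector still recognizes it, which is the sense in which more models per dimension behave like a longer training window.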