
April 2020

Circonus Spring 2020 Release Includes Kubernetes Monitoring Solution

This week, we announced the availability of our Spring 2020 release. The highlight of the release is our Kubernetes monitoring solution, which provides health-based alerting and horizontal pod auto-scaling. Additional enhancements include cloud monitoring, GCP Marketplace availability, performance improvements, and a more comprehensive Terraform integration. Here’s some background on these latest capabilities.

Monitoring Latency SLOs with Histograms and CAQL

Latency SLOs help us quantify the performance of an API endpoint over a period of time. A typical latency SLO reads as follows: "The proportion of valid requests served over the last 4 weeks that were slower than 100ms is less than 1%." In this context, "valid" means that the request responded with a status code in the 200s.
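To make the arithmetic behind such an SLO concrete, here is a minimal Python sketch. It works on raw request records rather than histograms or CAQL, and the names `Request`, `slow_fraction`, and `meets_slo` are illustrative assumptions, not part of any Circonus API. It also assumes the records passed in already cover the SLO window (for example, the last 4 weeks).

```python
from dataclasses import dataclass


@dataclass
class Request:
    """One served request (hypothetical record type for this sketch)."""
    status: int        # HTTP status code
    latency_ms: float  # time taken to serve the request, in milliseconds


def slow_fraction(requests, threshold_ms=100.0):
    """Return the fraction of valid (2xx) requests slower than threshold_ms."""
    valid = [r for r in requests if 200 <= r.status < 300]
    if not valid:
        # No valid requests in the window: nothing was slow.
        return 0.0
    slow = sum(1 for r in valid if r.latency_ms > threshold_ms)
    return slow / len(valid)


def meets_slo(requests, threshold_ms=100.0, max_slow_fraction=0.01):
    """True when strictly less than max_slow_fraction of valid requests are slow."""
    return slow_fraction(requests, threshold_ms) < max_slow_fraction
```

Requests with non-2xx status codes are excluded from both the numerator and the denominator, so a burst of slow error responses does not count against the latency SLO.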

Using CAQL to Identify Hosts with Top CPU Usage

A common task when monitoring infrastructure is identifying the top resource consumers. Although the following techniques can be applied to many different resource metrics, we will specifically look at the problem of identifying which of our hosts or services are consuming the most CPU resources.
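As a rough illustration of the underlying idea, independent of CAQL, the following Python sketch ranks hosts by their average CPU usage. The function name `top_cpu_hosts` and the shape of the input (a mapping from host name to a list of CPU percentage samples) are assumptions made for this example.

```python
import heapq
from statistics import mean


def top_cpu_hosts(samples, k=3):
    """Return the k hosts with the highest average CPU usage.

    samples: dict mapping host name -> list of CPU usage percentages
             (hypothetical input shape for this sketch).
    Returns a list of (host, average) pairs, highest average first.
    """
    # Skip hosts that reported no samples; mean() would raise on an empty list.
    averages = {host: mean(values) for host, values in samples.items() if values}
    return heapq.nlargest(k, averages.items(), key=lambda item: item[1])
```

Ranking by the mean is only one choice; the same structure works with the maximum or a high percentile when short CPU spikes matter more than sustained load.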