September 2018

Alert fatigue, part 2: alert reduction with Sensu filters & token substitution

In my previous post, I talked about the real costs of alert fatigue — the toll it can take on your engineers as well as your business — and some suggestions for rethinking alerting. In part 2 of this series, I’ll share some best practices for fine-tuning Sensu to help reduce alert fatigue.

Alert fatigue, part 1: avoidance and course correction

Alert fatigue occurs when someone is exposed to a large number of frequent alarms (alerts) and consequently becomes desensitized to them. The problem is not specific to technology fields: people in most professions that require on-call work, such as doctors, experience it in slightly different ways, but the underlying problem is the same.

Building + testing open source monitoring tools

At Monitorama 2018, I shared some of the processes and lessons I’ve learned from developing a product for people other than myself to use. After spending six years on call, I now build software that wakes people up in the night: infrastructure and tooling for systems monitoring and performance analysis. As someone who’s been there, I’m conscientious about building quality software that people delight in using.

Using NGINX for targeted access to the Sensu Core 1.4 API

NGINX can be used as a proxy to provide authenticated access to specific endpoints of any RESTful service API, including the Sensu API. Below I provide an NGINX configuration that grants external service providers narrow access to the Sensu 1.4 API, allowing them only to create check results. But first, here's some backstory on how I got here.

Puppet Sponsor Session at Sensu Summit 2018

In this talk from Sensu Summit 2018, Garrett Honeycutt showcases the Puppet module: its current state, its support for Sensu 2.0, highlights of community contributions, and how you can contribute. You’ll see the Vagrant setup and how, even if you don’t use Puppet, you can easily get Sensu up and running on a bunch of different platforms.

Project 3M: Meaningful Monitoring and Messaging at Sensu Summit 2018

In this talk from Sensu Summit 2018, Christopher J. Caillouet, Senior Dev|Ops Production Engineer at Industrial Light & Magic, looks behind the curtain to show how the intelligence and uptime ILM gains by leveraging Sensu in its monitoring infrastructure enable reliable, stable delivery across a large-scale, geographically distributed set of datacenters.

Pull, don't push: Architectures for monitoring & config in a microservices era at Sensu Summit 2018

In this Sensu Summit 2018 talk, Chef's Julian Dunn & Fletcher Nichol give you a primer on promise theory and the autonomous actor model that underlie the design of products like Sensu and Habitat, and explain why that design leads not only to higher overall system reliability but also to better human comprehension and easier operations.