
The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Painting a Complete Network Monitoring Picture: Why Context is Critical

To produce their masterpieces, artists like van Gogh, Rembrandt, Picasso, and Monet painted with more than just one color. Being able to choose from multiple colors (not to mention an abundance of talent, inspiration, and creativity) is what allowed these artists to see their complete vision come to life on canvas. However, if you’re relying on a single set of data to troubleshoot network issues, it’s like you’re stuck painting with one color.

5 Best IT Experience Practices Your Team Can Make Today

If you were to put 100 enterprise tech leaders in a room together and ask them if they think their company’s employee experience is dependent upon IT, I’m certain all would agree it is. But I’m also certain those 100 wouldn’t know: For IT decision-makers, the devil is in the details. Many are judged by uncompromising Service Level Agreements (SLAs) and shoddy survey data, not comprehensive digital experience trends and indexes.

CheckMK and Enterprise Alert - a scripted heartbeat check

A few days ago I received an inquiry about a scripting problem from one of our longtime partners: our DCP, Marc Handel from IT unlimited AG. In the exchange with Marc I realized that his idea of using the Enterprise Alert Scripting Host, the Windows Task Scheduler, and CheckMK to implement round-trip monitoring could be interesting for the whole community, especially for our CheckMK customers.
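The core of a heartbeat check is a freshness test: a scheduled job records a timestamp, and the monitor alerts when that timestamp gets too old. The following is a minimal Python sketch of that test only, not the script from the article. The function name `heartbeat_state`, the five-minute threshold, and the 0/2 status codes are assumptions chosen for illustration; how the result is reported to CheckMK or Enterprise Alert is not shown.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold: treat the heartbeat as lost after five minutes.
DEFAULT_MAX_AGE = timedelta(minutes=5)


def heartbeat_state(last_seen, now, max_age=DEFAULT_MAX_AGE):
    """Return (status, message) for a heartbeat timestamp.

    status is 0 when the heartbeat is fresh and 2 when it is stale.
    These codes are an assumption for this sketch.
    """
    age = now - last_seen
    if age <= max_age:
        return 0, f"heartbeat received {int(age.total_seconds())}s ago"
    return 2, f"no heartbeat for {int(age.total_seconds())}s"


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Simulate a heartbeat written one minute ago and one written an hour ago.
    print(heartbeat_state(now - timedelta(minutes=1), now))
    print(heartbeat_state(now - timedelta(hours=1), now))
```

Passing `now` explicitly keeps the function easy to test, because the check does not depend on the wall clock.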

Cloud or On-Prem? With Monitoring, It's Both-And, Not Either-Or

Despite the migration of services and systems to the cloud (either all or in part), many of the fundamental aspects of the day-to-day work IT practitioners do haven’t changed. They’ve just moved. In this session, SolarWinds Head Geek Leon Adato and Technical Content Manager for Community Kevin M. Sparenberg discuss that state of affairs, as well as what monitoring can do to help view those resources as a contiguous whole, despite possibly being split across the on-prem/cloud divide.

Introducing the Lightstep Metrics plugin for Grafana

Chris Sackes is a Software Engineer at Lightstep. A New Yorker by birth, he loves public transportation, architecture photography, and urban exploration. He’s spent the last five years engineering delightful user experiences for a variety of applications. Lightstep’s powerful metrics reporting and analysis are now available for Grafana users. Using the new Lightstep Metrics plugin for Grafana, you can view metrics data reported to Lightstep directly in your Grafana instance.

Monitoring Amazon CloudFront with Graphite via Graphite APIs

MetricFire offers complete system, infrastructure, and application monitoring using a suite of open-source monitoring tools. With MetricFire, you can monitor all your infrastructure on a single dashboard. The platform displays metrics on the dashboard using either Hosted Prometheus or Graphite-as-a-Service.

How Lowe's SRE reduced its mean time to recovery (MTTR) by over 80 percent

The stakes of managing Lowes.com have never been higher, and that means spotting, troubleshooting and recovering from incidents as quickly as possible, so that customers can continue to do business on our site. To do that, it’s crucial to have solid incident engineering practices in place. Resolving an incident means mitigating the impact and/or restoring the service to its previous condition.
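For readers unfamiliar with the metric, MTTR is the total time spent recovering from incidents divided by the number of incidents. The sketch below shows that arithmetic in Python. The function names and the sample figures are hypothetical and are not data from Lowe's; measuring from detection time to recovery time is also an assumption, since teams differ on where the clock starts.

```python
from datetime import datetime


def mean_time_to_recovery(incidents):
    """Return MTTR in minutes for a list of (detected, recovered) datetimes."""
    if not incidents:
        raise ValueError("at least one incident is required")
    total_seconds = sum(
        (recovered - detected).total_seconds() for detected, recovered in incidents
    )
    return total_seconds / len(incidents) / 60


def percent_reduction(before, after):
    """Return the percentage drop from `before` to `after`."""
    return 100 * (before - after) / before


if __name__ == "__main__":
    # Hypothetical incidents lasting 30 and 60 minutes.
    incidents = [
        (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 9, 30)),
        (datetime(2024, 1, 2, 14, 0), datetime(2024, 1, 2, 15, 0)),
    ]
    print(mean_time_to_recovery(incidents))  # 45.0
    # Hypothetical figures: an MTTR of 50 minutes reduced to 10 minutes.
    print(percent_reduction(50, 10))  # 80.0
```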