Operations | Monitoring | ITSM | DevOps | Cloud

The latest news and information on observability for complex systems and related technologies.

The Five Tenets of Observability

A new year is a chance for a fresh start, and a great opportunity to rethink the monitoring and observability platform you’re using for your applications. If you’ve been using a legacy monitoring system, you’ve probably heard about observability all over the ‘net and want to figure out whether it is really something you need to care about.

Make the most of your observability data with the Data Volume app

As a DevOps, SecOps, or IT operations manager, you're surrounded by all the technology for the systems running the entire organization. This means legacy infrastructure, multi-cloud environments, services, tools, and applications. All of these components generate data—a huge amount of data—some of which you need to leverage for full-stack observability to ensure those systems supporting the business are running efficiently.

How Reliability and Product Teams Collaborate at Booking.com

With more than 1.5M room nights booked per day, Booking.com requires a solid infrastructure that’s constantly monitored. And indeed, Booking.com now has a footprint of 50,000+ physical servers running across four data centers and six additional points of presence. The sheer size of this server fleet makes it viable for Booking.com to have dedicated teams that specialize solely in the reliability of those servers.

The Observability Pipeline

Today’s systems are more distributed, dynamic, and complex than ever before, and users expect more of them. The historical reliance on an operations team to monitor, triage, and resolve issues has also become untenable as the number of services has increased. This means that many of the tools that were once well suited to the job might no longer be adequate.

Ask Miss O11y: Long-Running Requests

You need not fear a long-lived streaming workload. A few simple tricks can transform a request that may not terminate for hours or days into something you can get regular health and status updates on. We in fact have one of those continuous processing services (Beagle, our Service Level Objective stream processor), which we’ve instrumented in this fashion.

The Business Case for Observability and Site Reliability Engineering

Unlike traditional IT Ops, the role of the SRE isn’t simply focused on finding and solving technical problems. The big win for today’s SREs is supporting the organization’s strategic innovation initiatives. With the appropriate observability capabilities, it’s possible to quantify the value that software infrastructure contributes to this innovation effort.

Wisdom of the Crowds: The Value of User Sentiment Observability

What’s the first thing most people do when they’re unhappy with a business? Take to social media to complain about it. Observing those comments, otherwise known as “user sentiment observability”, gives you a heads-up when problems become big enough to impact user experience. How can you monitor that voice of the customer? And why is it important to do so? Let’s take a deeper look at the issues.

ICYMI: Honeycomb Developer Week: The Partner Ecosystem

We know that you value collaboration. That’s why we share incident reviews and learnings—because we believe the entire community benefits by working together transparently. In the spirit of working better together, we invited ecosystem partners from ApolloGraph, Cloudflare, LaunchDarkly, and PagerDuty to present at Honeycomb Developer Week, a three-day event filled with snackable, time-efficient learning sessions to help you uplevel your observability skills.

Operations Analytics - The Next Big Thing

Cloud Data Warehouses (CDWs) were designed to support business intelligence use cases focused on historical data analysis, and less so the “what is happening now?” class of queries. We think operational analytics is the next big focus, and we want to discuss the space and how enterprises will connect their operational data to these new tools to get results right now instead of next week.