Operations | Monitoring | ITSM | DevOps | Cloud

Aligning Business and Engineering Goals with Honeycomb SLOs

Setting clear, measurable goals is essential for any successful team. However, aligning those goals with the technical work can be challenging in the fast-paced world of software engineering. Engineers might focus on reducing latency or improving uptime, while business leaders look at revenue and customer satisfaction. It gets tricky to track the impact between the two to justify when specific engineering initiatives are important, why, and how they impact the bottom line.

A CoPE's Guide to Alert Management

Alerts are a perennial topic, and a CoPE will need to engage with them. The bounds of this problem space are formed by two types of alerts: Understanding what these alerts are and how to configure them is one thing. Thinking about what they each do for your organization, and how using one or the other affects things, is another. The latter will be the focus of this article.

The CoPE and Other Teams, Part 2: Custom Instrumentation and Telemetry Pipelines

The previous post laid out the basic idea of instrumentation and how OpenTelemetry’s auto-instrumentation can get teams started. However, you can’t rely only on auto-instrumentation. This post will discuss the limitations in more detail and how a CoPE can help teams overcome them.

Apdex in Honeycomb

“How is my app performing?” is one of the most common, yet hardest questions to answer. There are myriad ways to measure this, like error rate, average response time, and so on. Enter the Application Performance Index (aka Apdex), a single metric that attempts to answer, “Are my application’s users happy?” Apdex is an open standard that was formalized in 2005 by the Apdex Alliance.

See Your Structured Logs in the Explore Data tab

There's a new way to flip through your data in Honeycomb, released this week! It's super for looking at structured logs. It's called: Explore Data. Get directly at the logs, spans, events, or metrics that power the fast analysis you can do with Honeycomb. See all the fields, the whole variety of values — now ordered by timestamp, with pagination. Modify your query and graphs right from the data table. It's all connected!

Making Room for Some Lint

It’s one of my strongly held beliefs that errors are constructed, not discovered. However we frame an incident’s causes, contributing factors, and context ends up influencing the shape of the corrective items (if any) that get created. I’ll cover these ideas by using our June 3rd incident where a database migration caused a large outage by locking up a shared database and making it run out of connections.

The CoPE and Other Teams, Part 1: Introduction & Auto-Instrumentation

The CoPE is made to affect, meaning change, how things work. The disruption it produces is a feature, not a bug. That disruption pushes things away from a locally optimal, comfortable state that generates diminishing returns. It sets things on a course of exploration to find new terrains which may benefit it more—and for longer.