
Teaching AI How to Refinery

At the beginning of February, we released v3.1 of Refinery, our advanced, tail-based sampling solution. The new version comes with more performance enhancements, bug fixes, and a few new pieces of telemetry. In tandem with the 3.1 release, we also released a new tool for our MCP server that helps your AIs understand Refinery and how Honeycomb handles sampling.

Happy Birthday to Us: Honeycomb 10 Year Manifesto, Part 1

Christine and I started Honeycomb in 2016, which means it’s been ten years. Christine, a developer, and I, an operations engineer, were both profoundly unhappy with the state of the art in monitoring and logging tools. The tools we had used at Facebook didn’t spray our signals around to a bunch of siloed-off pillars. They consolidated as much context as possible so we could properly explore it, the way every other non-software engineering team already takes for granted.

How Honeycomb Supercharges OpenTelemetry for AI

It has become common knowledge that the nature of software development has changed as AI code generation and agent-based features gain adoption. In perhaps a more subtle shift, the fundamentals of software instrumentation are changing too. As OpenTelemetry becomes the standard instrumentation layer across enterprises, with thousands of developers (many from Honeycomb) actively contributing to it, the telemetry data itself is evolving to meet the growing demand for rich context.

AI in Production Is Growing Faster Than We Can Trust it

Enterprise software has moved past the generative AI testing phase. Businesses with millions of daily users or workloads are no longer just prototyping LLMs in a vacuum. They’re directly wiring agentic efficiency into product interfaces and infrastructure to stay competitive. This wave is often compared to the spread of microservices in the past, but we aren’t just adding new dependencies and complexity.

Measuring Claude Code ROI and Adoption in Honeycomb

At Honeycomb, we’ve been using Claude Code across our engineering team for a while. Anecdotally, I had a sense of who the power users were, and I had seen some examples of complex usage. But I wanted to be able to confidently answer questions about adoption and ROI. Claude Code supports OpenTelemetry out of the box, which means sending telemetry to Honeycomb takes just a few minutes of configuration.

Observability with AI? Honeycomb with AI!

Since Honeycomb started, it has had a weakness: too many choices. Every field, custom or standard, hundreds of them, all are free to group, filter, and visualize in dozens of ways. Which ones are interesting? Honeycomb exists to help people understand custom software. It doesn’t pretend to know what matters in your application. That’s an interpretive task, not programmatic. Hey, computers can do interpretation now!

"You Had One Job": Why Twenty Years of DevOps Has Failed to Do it

Let’s start with a question. What is DevOps all about? I’ll tell you my answer. In retrospect, I think the entire DevOps movement was a mighty, twenty-year battle to achieve one thing: a single feedback loop connecting devs with prod. On those grounds, it failed. Not because software engineers weren’t good at their jobs, or didn’t care enough. It failed because the technology wasn’t good enough.

OpAMP Explained: Why OpenTelemetry Needed an Agent Management Protocol (and How We Use It)

OpenTelemetry makes it easy to produce and transmit any type of telemetry. In production environments, this often means deploying the OpenTelemetry Collector as an intermediary to process, enrich, and route telemetry data. As systems scale, so does this infrastructure—sometimes to hundreds or thousands of Collectors spread across environments.

Reporting Exceptions to Honeycomb with Frontend Observability

So you've built a client application and you've started sending telemetry. The information sent back by this client is vital to you, and one of the first things you care about is capturing and reporting errors. There are at least two ways to report error details in OpenTelemetry. Web applications generally place exceptions in trace spans as span events, and mobile applications send exceptions as log messages instead.
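To make the two shapes concrete, here is a rough sketch in plain Python, not the OpenTelemetry SDK. It uses the `exception.type`, `exception.message`, and `exception.stacktrace` attribute names from the OpenTelemetry semantic conventions, and the event name `exception` for the span-event form. The helper names (`exception_attributes`, `as_span_event`, `as_log_record`) and the dictionary layouts are hypothetical stand-ins for illustration only; a real application would let its SDK build these records.

```python
import traceback


def exception_attributes(exc: BaseException) -> dict:
    """Collect the semantic-convention attributes that describe an exception."""
    return {
        "exception.type": type(exc).__name__,
        "exception.message": str(exc),
        "exception.stacktrace": "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        ),
    }


def as_span_event(exc: BaseException) -> dict:
    """Web-style shape: the exception rides on a trace span as an event.

    The dict layout is illustrative, not the SDK's data model.
    """
    return {"name": "exception", "attributes": exception_attributes(exc)}


def as_log_record(exc: BaseException) -> dict:
    """Mobile-style shape: the exception is sent as a log message.

    The dict layout is illustrative, not the SDK's data model.
    """
    return {
        "severity_text": "ERROR",
        "body": str(exc),
        "attributes": exception_attributes(exc),
    }


if __name__ == "__main__":
    try:
        raise ValueError("bad input")
    except ValueError as err:
        print(as_span_event(err)["name"])
        print(as_log_record(err)["severity_text"])
```

Both shapes carry the same exception attributes; what differs is the envelope (a span event versus a log record), which is why a query written for one form may need adjusting for the other.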