November 2022

New Honeycomb Integrations Let You Bubble Up Lurking AWS Issues

Today, we’re announcing the expansion of Honeycomb integrations with various AWS services. This update covers a much wider swath of AWS services and makes it easier to integrate your AWS stack with Honeycomb. With our new BubbleUp enhancements, you’ll be identifying and debugging hidden issues in your AWS stack faster than ever.

The Incident Retrospective Ground Rules

I joined Honeycomb as a Staff Site Reliability Engineer (SRE) midway through September, and it’s been a wild ride so far. One thing I was especially excited about was the opportunity to see Honeycomb’s incident retrospective process from the inside. I wasn’t disappointed! The first retrospective I took part in was for our ingestion delays incident on September 8th.

Scaling Ingest With Ingest Telemetry

With the introduction of Environments & Services, we’ve seen a dramatic increase in the creation of new datasets. These new datasets are smaller than ones created with Honeycomb Classic, where customers would typically place all of their services under a single, large dataset. This change has presented some interesting scaling challenges, which I’ll detail in this post, along with the solution we used, and how we leveraged Honeycomb’s own telemetry to scale Honeycomb.

Customer Story: Intercom Reduces MTTWTF With Observability and Distributed Tracing

Intercom’s mission is to build better communication between businesses and their customers. With that in mind, they began their journey away from metrics alone and towards complete observability. The first step was tooling, and they learned quickly that trying to work with multiple solutions was not the answer.

Announcing New CircleCI + Honeycomb Integration Guide

If you’re writing software today, then you likely use a CI/CD pipeline to build and test your code before deploying it to production. Having a fast and efficient build pipeline saves you development time, shortens feedback loops, and helps you ship features faster. Conversely, slow and unreliable build pipelines are full of lost productivity and sadness.

Touching Grass With SLOs

One of the things that struck me upon joining Honeycomb was the seemingly laissez-faire approach we took towards internal SLOs. From my own research (beginning with the classic SRE book, following Google’s example), I came to these conclusions: If you read the original SRE book when it was released, before the workbook came out, these conclusions all made sense.

Monitoring Cloud Database Costs with OpenTelemetry and Honeycomb

In the last few years, the usage of databases that charge by request, query, or insert, rather than by provisioned compute infrastructure (e.g., CPU and RAM), has grown significantly. They’re popular for many of the same reasons that serverless compute functions are, since the cost scales with your usage. No one is using your site? No problem: you’re not charged.