OpenTelemetry 2022 Holiday Goodie Bag
We here at Honeycomb really like OpenTelemetry and goodie bags, so we have a nice little OpenTelemetry-flavored holiday goodie bag to share with you before you’re off for the holidays!
The line from observability to customer joy is straighter than you think. We recently learned this from NS1, a managed DNS provider and Honeycomb customer, in a panel discussion with Nate Daly, Head of Architecture at NS1, and Chris Bertinato, Software Architect at NS1.
Bugs can remain dormant in a system for a long time, until they suddenly manifest themselves in weird and unexpected ways. The deeper in the stack they are, the more surprising they tend to be. One such bug reared its head within our columnar datastore in May this year, but had been present for more than two years before detection.
Today, we’re announcing the expansion of Honeycomb integrations with various AWS services. This update covers a much wider swath of AWS services and makes it easier to integrate your AWS stack with Honeycomb. With our new BubbleUp enhancements, you’ll be identifying and debugging hidden issues in your AWS stack faster than ever.
I joined Honeycomb as a Staff Site Reliability Engineer (SRE) midway through September, and it’s been a wild ride so far. One thing I was especially excited about was the opportunity to see Honeycomb’s incident retrospective process from the inside. I wasn’t disappointed! The first retrospective I took part in was for our ingestion delays incident on September 8th.
With the introduction of Environments & Services, we’ve seen a dramatic increase in the creation of new datasets. These new datasets are smaller than ones created with Honeycomb Classic, where customers would typically place all of their services under a single, large dataset. This change has presented some interesting scaling challenges, which I’ll detail in this post, along with the solution we used, and how we leveraged Honeycomb’s own telemetry to scale Honeycomb.
Intercom’s mission is to build better communication between businesses and their customers. With that in mind, they began their journey away from metrics alone and towards complete observability. The first step was tooling, and they learned quickly that trying to work with multiple solutions was not the answer.
If you’re writing software today, then you likely use a CI/CD pipeline to build and test your code before deploying it to production. Having a fast and efficient build pipeline saves you development time, shortens feedback loops, and helps you ship features faster. Conversely, slow and unreliable build pipelines are full of lost productivity and sadness.
There are a ton of leaves in my yard, and I’m slowly coming out of a weeklong sugar coma. That can only mean October has come and gone. Let’s take a peek at what new and noteworthy changes Honeycomb has made since we last checked in.
One of the things that struck me upon joining Honeycomb was the seemingly laissez-faire approach we took towards internal SLOs. From my own research (beginning with the classic SRE book, following Google’s example), I came to these conclusions: If you read the original SRE book when it was released, before the workbook came out, these conclusions all made sense.