When you send telemetry into Honeycomb, our infrastructure needs to buffer your data before processing it in our “retriever” columnar storage database. For the entirety of Honeycomb’s existence, we have used Apache Kafka to perform this buffering function in our observability pipeline.
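As a rough illustration of that buffering step, the sketch below shows how incoming telemetry might be written to a Kafka topic before a downstream consumer (such as the storage engine) reads it. The broker address, topic name, partitioning key, and JSON payload here are assumptions for illustration, not Honeycomb's actual implementation.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class TelemetryBuffer {
    public static void main(String[] args) {
        // Hypothetical broker address; real deployments point at a Kafka cluster.
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Keying by dataset keeps events for the same dataset on the same
            // partition, preserving their order (an assumption, not Honeycomb's scheme).
            String dataset = "my-service";
            String event = "{\"duration_ms\": 12, \"status\": 200}";
            producer.send(new ProducerRecord<>("telemetry-ingest", dataset, event));
        } // close() flushes any buffered records before exiting
    }
}
```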
In a perfect world, small services scale up and down smoothly while a high-performance message broker acts like a heart, pumping data through the system in a controlled, efficient manner. In the real world, however, sooner or later some application starts producing large bursts of data in a sporadic, uncontrollable way. In this article, we address the problem of adapting a legacy system to work well with a stream-based platform.
Enterprises typically run many servers, firewalls, databases, mobile devices, API endpoints, and other components that power their IT. Because of this, organizations must dedicate resources to managing logged events across the environment. Logging plays a key role in detecting and blocking cyber-attacks, and organizations rely on log data for auditing and post-incident investigation. Message brokers such as Apache Kafka can ingest log data in real time, then process, store, and route it to downstream systems.
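As a minimal sketch of that ingest-and-route pattern, the consumer below reads log events from a Kafka topic and could forward them to storage, a search index, or an alerting pipeline. The topic name, group ID, and broker address are placeholders; a real deployment would add error handling, batching, and schema management.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class LogRouter {
    public static void main(String[] args) {
        // Hypothetical broker address, consumer group, and topic name.
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "log-router");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("security-logs"));
            while (true) {
                // Poll the broker for newly arrived log events.
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    // Route each event: archive it, index it for search, or raise an alert.
                    System.out.printf("source=%s log=%s%n", record.key(), record.value());
                }
            }
        }
    }
}
```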