Kafka

Create Kafka Topics in 3 Easy Steps

Creating a topic in production is an operational task that requires awareness and preparation. In this tutorial, we’ll explain all the parameters to consider when creating a new topic in production. You must set the partition count and replication factor when creating a new topic, and both choices affect the performance and reliability of your system.
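
As a rough illustration of the two required settings, the sketch below assembles the arguments for the `kafka-topics.sh --create` command and rejects a replication factor larger than the number of brokers, which Kafka cannot satisfy. The function name, the `broker_count` parameter, and the default `localhost:9092` address are assumptions made for this example, not part of the tutorial.

```python
from typing import List


def build_create_topic_command(
    topic: str,
    partitions: int,
    replication_factor: int,
    broker_count: int,
    bootstrap_server: str = "localhost:9092",  # assumed address, adjust for your cluster
) -> List[str]:
    """Return the argument list for creating a topic with kafka-topics.sh.

    Hypothetical helper: it only builds and validates the command,
    it does not contact a cluster.
    """
    if partitions < 1:
        raise ValueError("partition count must be at least 1")
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    # Each replica of a partition lives on a different broker, so the
    # replication factor cannot exceed the number of available brokers.
    if replication_factor > broker_count:
        raise ValueError("replication factor cannot exceed broker count")

    return [
        "kafka-topics.sh",
        "--create",
        "--bootstrap-server", bootstrap_server,
        "--topic", topic,
        "--partitions", str(partitions),
        "--replication-factor", str(replication_factor),
    ]


if __name__ == "__main__":
    # Example: 6 partitions, 3 replicas, on a 3-broker cluster.
    print(" ".join(build_create_topic_command("orders", 6, 3, broker_count=3)))
```

The resulting list can be passed to `subprocess.run` on a host where the Kafka command-line tools are installed.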

Deploying Kafka with the ELK Stack

Logs are unpredictable. Following a production incident, and precisely when you need them the most, logs can suddenly surge and overwhelm your logging infrastructure. To protect Logstash and Elasticsearch against such data bursts, users deploy buffering mechanisms to act as message brokers. Apache Kafka is the most common broker solution deployed together with the ELK Stack.

NYC Kafka Meetup: How To Rewind The New York Times Homepage and Capacity Planning at Datadog

Datadog recently hosted the NYC Kafka Meetup. Presenters included Jamie Alquiza (Datadog), Stephen Dotz (NY Times) and Michael Kaminski (NY Times). Jamie shared how Datadog conducts capacity planning for Kafka, and the NY Times team shared how their publishing pipeline works.

Kafka Metrics to Monitor

As the first part of a three-part series on Apache Kafka monitoring, this article explores which Kafka metrics are important to monitor and why. When monitoring Kafka, it’s important to also monitor ZooKeeper as Kafka depends on it. The second part will cover Kafka open source monitoring tools, and identify the tools and techniques you need to further help monitor and administer Kafka in production.

Kafka Open Source Monitoring Tools

Open source software adoption continues to grow within enterprises (even for legacy applications), beyond just startups and born-in-the-cloud software. In this second part of our Kafka monitoring series (see the first part discussing Kafka metrics to monitor), we’ll take a look at some open source tools available to monitor Kafka clusters. We’ll explore what it takes to install, configure, and actually use each tool in a meaningful way.

Monitoring Kafka with Sematext

Monitoring Kafka is a tricky task. As you can see in the first chapter, Kafka Key Metrics to Monitor, the setup, tuning, and operations of Kafka require deep insights into performance metrics such as consumer lag, I/O utilization, garbage collection and many more. Sematext provides an excellent alternative to other Kafka monitoring tools because it’s quick and simple to use.
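
To make the consumer lag metric mentioned above concrete, here is a minimal sketch of how lag is commonly derived: for each partition, the latest offset in the log minus the offset the consumer group has committed. The function name and the plain dictionaries standing in for broker responses are assumptions for illustration; this is not how Sematext or any particular tool computes it.

```python
from typing import Dict


def consumer_lag(
    end_offsets: Dict[int, int],
    committed_offsets: Dict[int, int],
) -> Dict[int, int]:
    """Return per-partition lag as log end offset minus committed offset.

    Assumption for this sketch: a partition with no committed offset is
    treated as committed at 0, so its lag equals its end offset.
    """
    lag = {}
    for partition, end in end_offsets.items():
        committed = committed_offsets.get(partition, 0)
        # Clamp at zero so a stale end-offset reading never yields negative lag.
        lag[partition] = max(end - committed, 0)
    return lag


if __name__ == "__main__":
    ends = {0: 1500, 1: 1200, 2: 900}
    committed = {0: 1450, 1: 1200}
    per_partition = consumer_lag(ends, committed)
    print(per_partition)
    print("total lag:", sum(per_partition.values()))
```

A steadily growing total usually means consumers are not keeping up with producers, which is why lag is often one of the first metrics to alert on.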

Kafka Logging with the ELK Stack

Kafka and the ELK Stack — usually these two are part of the same architectural solution, Kafka acting as a buffer in front of Logstash to ensure resiliency. This article explores a different combination — using the ELK Stack to collect and analyze Kafka logs. As explained in a previous post, Kafka plays a key role in our architecture. As such, we’ve constructed a monitoring system to ensure data is flowing through the pipelines as expected.

Apache Kafka Tutorial - Use Cases & Challenges Logging at Scale

Organizations that handle logging at scale eventually run into the same problem: too many events are being generated, and logging components can’t keep up. Even with persistent queues and other mitigating features enabled, there’s simply not enough of a buffer between log generators and log ingesters to handle the volume of log lines coming in.

Monitoring Apache Spark applications running on Amazon EMR

We recently implemented a Spark streaming application, which consumes data from multiple Kafka topics. The data consumed from Kafka comprises different types of telemetry events generated by mobile devices. We decided to host the Spark cluster using the Amazon EMR service, which manages a fleet of EC2 instances to run our data-processing pipelines.