As the first part of a three-part series on Apache Kafka monitoring, this article explores which Kafka metrics are important to monitor and why. When monitoring Kafka, it’s important to also monitor ZooKeeper, since Kafka depends on it. The second part will cover open source Kafka monitoring tools and identify the tools and techniques you need to monitor and administer Kafka in production.
Open source software adoption continues to grow within enterprises (even for legacy applications), beyond just startups and born-in-the-cloud software. In this second part of our Kafka monitoring series (see the first part discussing Kafka metrics to monitor), we’ll take a look at some open source tools available to monitor Kafka clusters. We’ll explore what it takes to install, configure, and actually use each tool in a meaningful way.
Monitoring Kafka is a tricky task. As you can see in the first chapter, Kafka Key Metrics to Monitor, the setup, tuning, and operations of Kafka require deep insights into performance metrics such as consumer lag, I/O utilization, garbage collection and many more. Sematext provides an excellent alternative to other Kafka monitoring tools because it’s quick and simple to use.
Over the last six months, Opsgenie’s customer base has expanded significantly. We’ve become the tool of choice for teams that are new to operating always-on services, as well as those who have been left disappointed by alternative solutions. We can claim many advantages over our competition, but here are the top ten reasons Dev and Ops teams are choosing Opsgenie.
When I started Checkly, all the typical SaaS things around billing, credit cards and prorating confused the hell out of me. I understood them from an intellectual point of view, but not really from an implementation point of view.
It is 5 a.m. on Tuesday. The ETL job that populates revenue data into your organization’s data warehouse fails midway through the process. When the CFO opens the mobile dashboard to review the last day’s results, he immediately notices that the data is wrong – again. Over the next few hours, the on-call ETL architect determines what caused the data-load failure, fixes the issue, then restarts and monitors the job until it completes successfully.
Our recent webinar, “Stop Swivel-Chair IT Operations with OpsRamp and ServiceNow ITSM,” featured Curt Thorin, Solutions Strategist, and Jordan Sher, Director of Corporate Marketing. The webinar addressed the challenge of managing alerts and remediating incidents at scale, and how the right automation and ITSM integration investments (powered by AIOps) are helping enterprises address alert storms and service degradations.
SaltStack is an open source configuration management tool that lets you manage your infrastructure as code. Using SaltStack, you can manage tens of thousands of servers remotely using either a declarative language or imperative commands. It’s similar to configuration management tools such as Puppet, Chef, and Ansible.